计算机应用 ›› 2010, Vol. 30 ›› Issue (12): 3391-3396.

• 数据库与数据挖掘 • 上一篇    下一篇

基于Apriori图挖掘算法的优化及其在3D构造解析的应用

陈立宁1,罗可2   

  1. 1. 长沙市开福区潮宗街
    2.
  • 收稿日期:2010-05-31 修回日期:2010-07-29 发布日期:2010-12-22 出版日期:2010-12-01
  • 通讯作者: 陈立宁
  • 基金资助:
    基于图形处理器的高性能计算;湖南省科技计划项目基金

Fast AGM algorithm and application to three-dimensional structure analysis

  • Received:2010-05-31 Revised:2010-07-29 Online:2010-12-22 Published:2010-12-01
  • Contact: Ning LiCHEN

摘要: 基于Apriori的图挖掘(Apriori-based Graph Mining,AGM)算法结构简单,以递归统计为基础,但在面临庞大图数据集时,由于存在子图同构问题,在生成候选子图时容易产生很多冗余子图,增大了计算时间的开销。因此在AGM算法基础上提出一种改进方法,通过增加约束来减少候选子图生成数量,同时引入三次元坐标对图的顶点间的距离进行计算,并归结到边的标识当中,以处理三维图结构数据。通过改进算法对化学化合物进行分析,描述其三维化学结构以及生理活性上的相互关系,并测试了不同条件下改进方法的时间开销,实验结果表明在边标识数较多的情况下改进算法比原算法缩短了计算时间,提高了效率。

关键词: 基于Apriori的图挖掘算法, 子图同构, 图结构数据, 三维坐标, 生理活性

Abstract: Apriori-based Graph Mining (AGM) algorithm is simple and it is based on recursion statistics. When graph data set is very large, due to sub-graph isomorphism problem, so many redundant sub-graphs would be generated while generating candidate sub-graphs, which increases computation time. In this paper, an improved method was proposed to reduce redundant sub-graph candidates by adding extra constraints and use three-dimensional coordinate to calculate the distance between each vertex of a graph, which was added to the edge label for handling three-dimensional graph structured data. In this paper, chemical compounds were analyzed by the improved algorithm to describe their three-dimensional chemical structure and correlation with physiological activity and the computation time on different conditions were examined. The experimental results prove that the improved algorithm cuts down the computation time with more edge labels and improves the efficiency.

Key words: Apriori-based Graph Mining (AGM) algorithm, sub-graph isomorphism, graph structured data, three-dimensional coordinate, physiological activity