计算机应用

• 人工智能与仿真 •    下一篇

BIGDATA+207 大规模标签图中的动态Top-K兴趣子图查询

宋宝燕1,贾春杰2,单晓欢2,丁琳琳2,丁兴艳1   

  1. 1. 辽宁大学 信息学院, 沈阳 110036
    2. 辽宁大学
  • 收稿日期:2017-09-30 发布日期:2017-09-30 出版日期:2017-10-09
  • 通讯作者: 单晓欢
  • 作者简介:宋宝燕(1965—),女,辽宁开原人,教授,博士,CCF高级会员,主要研究方向:数据库、RFID事件流处理、大数据管理、图数据管理; 贾春杰(1991—),女,河北廊坊人,硕士研究生,主要研究方向:图数据管理; 单晓欢(1987—),女,辽宁沈阳人,助理实验师,博士研究生,主要研究方向:数据库、图数据管理; 丁琳琳(1983- ),女,辽宁阜新人,副教授,博士,CCF会员,主要研方向:大数据管理、分布式数据管理、不确定数据管理; 丁兴艳(1991—),女,贵州贵阳人,硕士研究生,主要研究方向:图数据管理。
  • 基金资助:

    国家自然科学基金资助项目(61472169, 61502215);国家重点研发计划项目(2016YFC0801406);辽宁省教育厅一般项目(L2015193);辽宁省博士科研启动基金项目(201501127)。

Dynamic Top-K interesting subgraph query on large-scale labeled graph

  • Received:2017-09-30 Online:2017-09-30 Published:2017-10-09
  • About author:SONG Baoyan, born in 1965. Ph. D., professor, the senior member of CCF. Her research interests include database techniques, RFID event stream processing techniques, big data management and graph data management. JIA Chunjie, born in 1991. M. S. candidate. Her research interests include graph data management. SHAN Xiaohuan, born in 1987. Dr., assistant experimentalist. Her research interests include database techniques and graph data management. DING Linlin, born in 1983. Ph. D., associate professor. Her research interests include big data management. distributed data management,uncertain data management. DING Xingyan, born in 1991. M.S.. Her research interests include graph data management.
  • Supported by:

    This work is supported by the National Natural Science Foundation of China (61472169, 61502215), the National Key Research and Development Program of China (2016YFC0801406), Science Research Normal Fund of Liaoning Province Education Department (L2015193), Doctoral Scientific Research Start Foundation of Liaoning Province (201501127).

摘要:

针对传统算法由于时间或空间复杂度过高而难以实现标签图规模增大且动态变化情况下的Top-K子图查询问题,提出一种适用于大规模标签图的动态Top-K兴趣子图查询方法。该方法建立包括节点拓扑结构特性和边特性的图拓扑结构特性索引(GTSF),利用该索引可有效剪枝过滤不满足限制条件的无效节点及边;基于GTSF索引提出多因素候选集过滤策略,通过对查询图候选集进一步剪枝以获得较少的候选集;提出的Top-K兴趣子图匹配验证方法将分为初始匹配和动态修正两个阶段,尽可能保证查询结果的实时、准确。大量实验结果表明,提出方法相比RAM、RWM算法的索引创建时间较短且占用空间较少,能有效处理大规模标签图中的动态Top-K兴趣子图查询。

关键词: 大规模标签图, 动态Top-K, 兴趣子图, 子图查询

Abstract:

The traditional algorithms are difficult to implement the Top-K subgraph query on large-scale dynamic  labeled graph due to high time or space complexity. For this reason, we propose a dynamic Top-K interesting subgraph query method. It  establishes graph topology feature index (GTSF) including node topology feature index and edge feature index, which can effectively prune and filter the invalid nodes and edges. The multi-factor candidate set filtering strategy is proposed based on GTSF index, which can be further pruned to obtain fewer candidate sets. The proposed matching-verification method will be divided into initial matching and dynamic correction, which try to ensure the real-time and accuracy of the query results. Experimental results show that the proposed method is faster to build indexes and occupies less space than the RAM and RWM. Our method can effectively deal with dynamic Top-K interesting subgraph query on large-scale labeled graph and has certain practical significance.

Key words: large-scale labeled graph, dynamic Top-K, interesting subgraph, subgraph query

中图分类号: