计算机应用 ›› 2019, Vol. 39 ›› Issue (1): 39-45.DOI: 10.11772/j.issn.1001-9081.2018071593

• 2018年全国开放式分布与并行计算学术年会(DPCS 2018)论文 • 上一篇    下一篇

基于索引的子图查询技术研究进展

施炜杰1, 董一鸿1, 王雄1, 潘剑飞1,2   

  1. 1. 宁波大学 信息科学与工程学院, 浙江 宁波 315211;
    2. 北京百度在线科技有限公司, 北京 100084
  • 收稿日期:2018-07-19 修回日期:2018-08-09 出版日期:2019-01-10 发布日期:2019-01-21
  • 通讯作者: 董一鸿
  • 作者简介:施炜杰(1993-),男,浙江宁波人,硕士研究生,主要研究方向:数据挖掘、机器学习;董一鸿(1969-),男,浙江宁波人,教授,博士,主要研究方向:大数据、数据挖掘、人工智能;王雄(1993-),男,江苏泰州人,硕士研究生,主要研究方向:大数据、数据挖掘;潘剑飞(1991-),男,安徽黄山人,工程师,硕士,主要研究方向:大数据、数据挖掘。
  • 基金资助:
    国家自然科学基金资助项目(61572266);浙江省自然科学基金资助项目(LY16F020003);宁波市自然科学基金资助项目(2017A610114)。

Research progress of index-based subgraph query technology

SHI Weijie1, DONG Yihong1, WANG Xiong1, PAN Jianfei1,2   

  1. 1. College of Information Science and Engineering, Ningbo University, Ningbo Zhejiang 315211, China;
    2. Baidu Online Technology Company Limited, Beijing 100084, China
  • Received:2018-07-19 Revised:2018-08-09 Online:2019-01-10 Published:2019-01-21
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61572266), the Natural Science Foundation of Zhejiang Province (LY16F020003), the Natural Science Foundation of Ningbo City (2017A610114).

摘要: 图作为表示实体间的数据结构,在社区发现、生物化学分析、社会安全分析等数据关联性要求较高的领域有着广泛的应用。对于大规模数据下进行实时的图查询问题,通过构建合适的索引可以有效降低查询响应时间,提高查询精确度。首先介绍基于索引的子图查询算法的基本结构;然后按索引的构建方式将主流算法分为基于枚举的方法和基于频繁模式挖掘的方法两大类,分别从索引特征、索引结构、应用数据集等方面进行介绍和分析;最后对基于索引的子图查询算法面临的主要问题进行总结和分析,阐述了最新的分布式系统下图查询技术,并对未来趋势进行展望。

关键词: 子图同构, 索引, 子图查询, 频繁模式

Abstract: As a type of data structure representing entities, graphs are widely used in fields that have high requirements on data relevance, such as community data discovery, biochemical analysis and social security analysis. Focusing on the issue of real-time graph query operation under large-scale data, building a suitable index can effectively reduce query response time and improve query accuracy. The basic structure of index-based subgraph query algorithm was firstly introduced and then state-of-the-art algorithms were divided into two categories by construction method of index:enumeration construction and frequent pattern mining construction. Then these algorithms were introduced and analyzed from three aspects:index features, index structures and application datasets. Finally, main problems toward index-based subgraph query algorithm were summarized and analyzed, the latest query technology based on the distributed system was briefly described, and the future trend was forecasted.

Key words: subgraph isomorphism, index, subgraph query, frequent pattern

中图分类号: