计算机应用 ›› 2016, Vol. 36 ›› Issue (7): 1998-2005.DOI: 10.11772/j.issn.1001-9081.2016.07.1998

• 大数据 • 上一篇    下一篇

基于参考节点嵌入的图可达性查询

温菊屏1,2, 胡小生1,2, 林冬梅2,3, 曾亚光1   

  1. 1. 佛山科学技术学院 电子与信息工程学院, 广东 佛山 528000;
    2. 佛山市计算机学会, 广东 佛山 528000;
    3. 佛山科学技术学院 信息与教育技术中心, 广东 佛山 528000
  • 收稿日期:2015-12-07 修回日期:2016-03-15 出版日期:2016-07-10 发布日期:2016-07-14
  • 通讯作者: 温菊屏
  • 作者简介:温菊屏(1979-),女,江西赣州人,讲师,硕士,主要研究方向:数据挖掘、虚拟现实;胡小生(1978-),男,湖北黄冈人,讲师,硕士,主要研究方向:机器学习、数据挖掘;林冬梅(1969-),女,黑龙江肇东人,教授,硕士,CCF会员,主要研究方向:组合优化、智能计算;曾亚光(1975-),男,湖南湘潭人,副教授,博士,主要研究方向:生物光子学、光学投影成析。
  • 基金资助:
    国家自然科学基金资助项目(11474053),广东高校优秀青年创新人才培养计划项目(2013LYM_0097,2014KQNCX184)。

Graph reachability query based on reference node embedding

WEN Juping1,2, HU Xiaosheng1,2, LIN Dongmei2,3, ZENG Yaguang1   

  1. 1. School of Electronic and Information Engineering, Foshan University, Foshan Guangdong 528000, China;
    2. Foshan City Computer Society, Foshan Guangdong 528000, China;
    3. Information and Educational Technology Center, Foshan University, Foshan Guangdong 528000, China
  • Received:2015-12-07 Revised:2016-03-15 Online:2016-07-10 Published:2016-07-14
  • Supported by:
    This work is partially supported by National Natural Science Foundation of China (11474053), Foundation for Distinguished Young Talents in Higher Education of Guangdong, China (2013LYM_0097, 2014KQNCX184).

摘要: 针对k步可达性查询算法无法解决带距离约束的图可达性查询问题,提出基于参考节点嵌入的图可达性查询算法。首先,从所有节点中选出极少数有代表性的全局参考节点,预先计算所有节点与全局参考节点之间的最短路径距离;然后,采用最短路径树和范围最小值查询技术求得局部参考节点;接着,利用三角不等式关系得到查询点对距离范围;最后,根据查询条件中的距离值与查询点对距离范围上、下限值的大小关系,可快速得出可达性结论。针对社会关系网络和公路网络数据,将所提算法与Dijkstra算法、K-Reach算法进行实验对比测试。相较于K-Reach算法,其索引建立时间小4个数量级,其索引规模小2个数量级;相较于Dijkstra算法,在公路网络和社会关系网络中,直接得出可达性结论的比例分别为92%和78.6%,其查询时间大大缩短,分别降低了95.5%和92%。实验结果表明:所提算法能够通过使用较小的索引开销,实现在线查询计算复杂度的降低,可很好地解决既适用于有权图又适用于无权图带距离约束的可达性查询问题。

关键词: k步可达性查询, 带距离约束的图可达性查询, 参考节点嵌入, 三角不等式关系, 最短路径树

Abstract: Focusing on the issue that the problem of graph reachability query with distance constraint can not be solved by k-step reachability algorithm, a method of graph reachability query based on reference node embedding was proposed. Firstly, a very small part of representative global reference nodes were selected from all nodes, the shortest path distance between all nodes and these global reference nodes were previously calculated. Then the local reference nodes were obtained through the technology of the shortest path tree and range minimum query. Next, distance range was obtained based on the triangle inequality relation. Lastly, according to the relationship between the distance in query condition and the distance range, a conclusion of reachability could be drawn quickly. Comparative tests were done on data of social network and road network, the proposed algorithm was compared with Dijkstra algorithm and K-Reach algorithm. Compared with K-Reach algorithm, the indexing time of the proposed algorithm was four orders of magnitude shorter, and the index size was two orders of magnitude smaller. Compared with Dijkstra algorithm, the proportion of drawing a reachability conclusion directly by the proposed algorithm was 92% in the New York Road network, while 78.6% in the Digital Bibliography & Library Project (DBLP) network, moreover, the running time was shortened greatly, which was reduced by 95.5% and 92% respectively. The experimental results demonstrate that the computational complexity of online queries is reduced with a small index cost through the method proposed in this paper, which is a good solution to graph reachability query with distance constraint, and suitable for weighted graph as well as unweighted graph.

Key words: k-step reachability query, graph reachability query with distance constraint, reference node embedding, triangle inequality relation, shortest path tree

中图分类号: