Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (1): 190-198.DOI: 10.11772/j.issn.1001-9081.2023010071

• Data science and technology • Previous Articles    

Maximum cycle truss community search based on hierarchical tree index on directed graphs

Chuanyu ZONG, Chunhe ZHANG(), Xiufeng XIA   

  1. School of Computer Science,Shenyang Aerospace University,Shenyang Liaoning 110136,China
  • Received:2023-01-31 Revised:2023-04-05 Accepted:2023-04-13 Online:2023-06-06 Published:2024-01-10
  • Contact: Chunhe ZHANG
  • About author:ZONG Chuanyu, born in 1985, Ph. D., associate professor. His research interests include data cleaning, data provenance, query processing and optimization.
    XIA Xiufeng, born in 1964, Ph. D., professor. His research interests include database, data warehouse.
  • Supported by:
    National Natural Science Foundation of China(61802268);Natural Science Foundation of Liaoning Province(2022-MS-303)

有向图上基于层次树索引的最大cycle truss社区搜索

宗传玉, 张纯鹤(), 夏秀峰   

  1. 沈阳航空航天大学 计算机学院,沈阳 110136
  • 通讯作者: 张纯鹤
  • 作者简介:宗传玉(1985—),男,山东潍坊人,副教授,博士,CCF会员,主要研究方向:数据清洗、数据溯源、查询处理与优化;
    夏秀峰(1964—),男,山东胶南人,教授,博士,CCF会员,主要研究方向:数据库、数据仓库。
    第一联系人:张纯鹤(1997—),男,黑龙江哈尔滨人,硕士研究生,主要研究方向:查询处理与优化;
  • 基金资助:
    国家自然科学基金资助项目(61802268);辽宁省自然科学基金资助项目(2022-MS-303)

Abstract:

Community search aims to find highly cohesive connected subgraphs containing user query vertices in information networks. Cycle truss is a community search model based on cycle triangle. However, the existing index-based cycle truss community search methods suffer from the drawbacks of large index space, low search efficiency, and low community cohesion. A maximum cycle truss community search method based on hierarchical tree index was proposed to address this issue. Firstly, a k-cycle truss decomposition algorithm was proposed, and two important concepts, cycle triangle connectivity and k-level equivalence were introduced. Based on k-level equivalence, the hierarchical tree index TreeCIndex and the table index SuperTable were designed. On this basis, two efficient cycle truss community search algorithms were proposed. The proposed algorithms were compared with existing community search algorithms based on TrussIndex and EquiTruss on four real datasets. The experimental results show that the space consumptions of TreeCIndex and SuperTable are at least 41.5% lower and the index construction time is 8.2% to 98.3% lower compared to TrussIndex and EquiTruss; furthermore, the efficiencies of searching for maximum cycle truss communities is increased by one and two orders of magnitude.

Key words: directed graph, community search, cycle truss, cycle triangle, hierarchical equivalence, hierarchical tree index

摘要:

社区搜索旨在从信息网络中找出包含用户查询顶点的高内聚连通子图,cycletruss是一种基于cycle三角形的社区搜索模型,而现有的基于索引的cycle truss社区搜索方法存在索引空间大、搜索效率低、社区内聚性低的缺点。为了解决这一问题,提出一种基于层次树索引的最大cycle truss社区搜索方法。首先,提出了k-cycle truss分解算法,并引入了两个重要的概念:cycle三角连通与k-层次等价。基于k-层次等价设计了层次树索引TreeCIndex与表结构索引SuperTable,在此基础上,并基于这两个新的索引,提出了两个高效的cycle truss社区搜索算法。在4个真实数据集上与已有的基于TrussIndex与EquiTruss的社区搜索算法进行了比较,实验结果表明,TreeCIndex与SuperTable比TrussIndex与EquiTruss节省至少41.5%的空间,索引构建的时间节省8.2%至98.3%,且搜索最大cycle truss社区的效率分别高出了一个和两个数量级。

关键词: 有向图, 社区搜索, cycletruss, cycle三角形, 层次等价, 层次树索引

CLC Number: