Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (2): 398-406.DOI: 10.11772/j.issn.1001-9081.2022010082

• Data science and technology • Previous Articles    

Parameter calculation algorithm of structural graph clustering driven by instance clusters

Chuanyu ZONG, Chao XIAN(), Xiufeng XIA   

  1. School of Computer Science,Shenyang Aerospace University,Shenyang Liaoning 110136,China
  • Received:2022-01-21 Revised:2022-04-06 Accepted:2022-04-11 Online:2022-04-25 Published:2023-02-10
  • Contact: Chao XIAN
  • About author:ZONG Chuanyu, born in 1985, Ph. D., associate professor. His research interests include data cleaning, data provenance, query processing and optimization.
    XIA Xiufeng, born in 1964, Ph. D., professor. His research interests include database, data warehouse.
  • Supported by:
    National Natural Science Foundation of China(61802268)


宗传玉, 宪超(), 夏秀峰   

  1. 沈阳航空航天大学 计算机学院,沈阳 110136
  • 通讯作者: 宪超
  • 作者简介:宗传玉(1985—),男,山东潍坊人,副教授,博士,CCF会员,主要研究方向:数据清洗、数据溯源、查询处理与优化
  • 基金资助:


Clustering results of the pSCAN (pruned Structural Clustering Algorithm for Network) algorithm are influenced by the density constraint parameter and the similarity threshold parameter. If the requirements cannot be satisfied by the clustering results obtained by the clustering parameters provided by the user, then the user’s own clustering requirements can be expressed through instance clusters. Aiming at the problem of instance clusters expressing clustering query requirements, an instance cluster-driven structural graph clustering parameter calculation algorithm PART and its improved algorithm ImPART were proposed. Firstly, the influences of two clustering parameters on the clustering results were analyzed, and correlation subgraph of instance cluster was extracted. Secondly, the feasible interval of the density constraint parameter was obtained by analyzing the correlation subgraph, and the nodes in the instance cluster were divided into core nodes and non-core nodes according to the current density constraint parameter and the structural similarity between nodes. Finally, according to the node division result, the optimal similarity threshold parameter corresponding to the current density constraint parameter was calculated, and the obtained parameters were verified and optimized on the relevant subgraph until the clustering parameters that satisfy the requirements of the instance cluster were obtained. Experimental results on real datasets show that a set of effective parameters can be returned for user instance clusters by using the proposed algorithm, and the proposed improved algorithm ImPART is more than 20% faster than the basic algorithm PART, and can return the optimal clustering parameters that satisfy the requirements of instance clusters quickly and effectively for the user.

Key words: structural graph clustering, instance cluster, parameter calculation, node division, optimal parameter



关键词: 图结构聚类, 实例簇, 参数计算, 节点划分, 最优参数

CLC Number: