计算机应用 ›› 2020, Vol. 40 ›› Issue (1): 28-35.DOI: 10.11772/j.issn.1001-9081.2019061008

• 人工智能 • 上一篇    下一篇

融合聚集系数的链接预测方法

刘昱阳, 李龙杰, 单娜, 陈晓云   

  1. 兰州大学 信息科学与工程学院, 兰州 730000
  • 收稿日期:2019-06-14 修回日期:2019-09-18 出版日期:2020-01-10 发布日期:2019-10-08
  • 通讯作者: 李龙杰
  • 作者简介:刘昱阳(1996-),男,山西临汾人,硕士研究生,主要研究方向:网络信息挖掘、链接预测;李龙杰(1981-),男,河南商丘人,工程师,博士,CCF会员,主要研究方向:数据挖掘、机器学习、网络信息挖掘;单娜(1996-),女,河北保定人,硕士研究生,主要研究方向:网络信息挖掘、链接预测;陈晓云(1954-),女,吉林长春人,教授,博士生导师,CCF高级会员,硕士,主要研究方向:数据挖掘、大数据分析、人工智能、网络信息挖掘。
  • 基金资助:
    国家自然科学基金青年基金资助项目(61602225);中央高校基本科研业务费专项(lzujbky-2019-90)。

Link prediction method fusing clustering coefficients

LIU Yuyang, LI Longjie, SHAN Na, CHEN Xiaoyun   

  1. School of Information Science and Engineering, Lanzhou University, Lanzhou Gansu 730000, China
  • Received:2019-06-14 Revised:2019-09-18 Online:2020-01-10 Published:2019-10-08
  • Supported by:
    This work is partially supported by the Youth Program of National Natural Science Foundation of China (61602225), the Fundamental Research Funds for the Central Universities (lzujbky-2019-90).

摘要: 许多基于网络结构信息的链接预测算法利用节点的聚集程度评估节点间的相似性,进而执行链接预测;然而,该类算法只注重网络中节点的聚集系数,没有考虑预测节点与共同邻居节点之间的链接聚集系数对节点间相似性的影响。针对上述问题,提出了一种融合节点聚集系数和非对称链接聚集系数的链接预测算法。首先,计算共同邻居节点的聚集系数,并利用共同邻居节点对应的两个非对称链接聚集系数计算该预测节点的平均链接聚集系数;然后,基于Dempster-Shafer证据理论将两种聚集系数进行融合生成一个综合性度量指标,并将该指标应用于中间概率模型(IMP),得到一个新的节点相似性指标(IMP_DS)。在9个网络数据上的实验结果表明,该算法的受试者工作特征(ROC)的曲线下方面积(AUC)与精度值(Precision)优于共同邻居(CN)、Adamic-Adar(AA)、资源分配(RA)指标和基于共同邻居的中间概率模型(IMP_CN)。

关键词: 链接预测, 复杂网络, Dempster-Shafer理论, 聚集系数, 相似性指标

Abstract: Many network structure information-based link prediction algorithms estimate the similarity between nodes and perform link prediction by using the clustering degree of nodes. However, these algorithms only focus on the clustering coefficient of nodes in network, and do not consider the influence of link clustering coefficient between the predicted nodes and their common neighbor nodes on the similarity between nodes. Aiming at the problem, a link prediction algorithm combining node clustering coefficient and asymmetric link clustering coefficient was proposed. Firstly, the clustering coefficient of common neighbor node was calculated, and the average link clustering coefficient of the predicted nodes was obtained by using two asymmetric link clustering coefficients of common neighbor node. Then, a comprehensive measurement index was obtained by fusing these two clustering coefficients based on Dempster-Shafer(DS) theory, and by applying the index to Intermediate Probability Model (IMP), a new node similarity index, named IMP_DS, was designed. The experimental results on the data of nine networks show that the proposed algorithm achieves performance in terms of Area Under the Curve (AUC) of Receiver Operating Characteristic (ROC) and Precision in comparison with Common Neighbor (CN), Adamic-Adar (AA), Resource Allocation (RA) indexes and InterMediate Probability model based on Common Neighbor (IMP_CN).

Key words: link prediction, complex network, Dempster-Shafer (DS) theory, clustering coefficient, similarity index

中图分类号: