Journal of Computer Applications, 2022, Vol. 42, Issue 9: 2823-2829. DOI: 10.11772/j.issn.1001-9081.2021071326

• Network and communications •

Link prediction algorithm based on information entropy improved PCA model

Yuyu MENG, Jing GUO

  1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou, Gansu 730070, China
  • Received: 2021-07-23 Revised: 2021-10-22 Accepted: 2021-10-25 Online: 2021-11-01 Published: 2022-09-10
  • Contact: Yuyu MENG
  • About author: GUO Jing, born in 1997 in Baiyin, Gansu, M.S. candidate. Her research interests include complex networks and link prediction.

Abstract:

Traditional link prediction methods give unstable results across networks with different structural features. To address this problem, a link prediction algorithm based on an information entropy improved Principal Component Analysis (PCA) model was proposed. Firstly, Random Forest (RF) was used to select seven similarity indexes as the optimal feature set. Then, the seven similarity indexes were combined in a feature information fusion model based on information entropy improved PCA. After the feature information was weighted, the model was combined with single-mechanism algorithms, and its correctness and verification effect were checked on six real-world datasets. Finally, the feasibility and effectiveness of the link prediction algorithm based on the proposed model were verified by comparing its Area Under the Curve (AUC) values with those of hybrid link prediction algorithms. Experimental results show that, compared with the Ordered Weighted Averaging aggregation operator (OWA) and the Ensemble-Model-based Link Prediction algorithm (EMLP), the proposed algorithm improves the AUC value by 2.5 to 12.46 percentage points and by 0.47 to 9.01 percentage points, respectively, with good stability and accuracy. Applying the proposed algorithm to networks with different structural features therefore yields more stable and more accurate link prediction results.

Key words: complex network, hybrid link prediction, information entropy, Principal Component Analysis (PCA), feature fusion
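
The abstract names the ingredients of the fusion step (entropy-based weighting of similarity indexes, PCA, and AUC evaluation) but not the formulas. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes the classical entropy weight method for the index weights, takes the first principal component of the weighted, standardized indexes as the fused score, and uses the pairwise-comparison form of AUC that is common in link prediction. The function names `entropy_weights`, `fuse_scores`, and `auc`, and the synthetic data in the demo, are hypothetical.

```python
import numpy as np


def entropy_weights(X):
    """Entropy weight method (assumed): one weight per column of X (samples x indexes)."""
    n = X.shape[0]  # requires n > 1
    lo, span = X.min(axis=0), X.max(axis=0) - X.min(axis=0)
    scaled = (X - lo) / np.where(span > 0, span, 1.0)  # min-max scale to [0, 1]
    col_sum = scaled.sum(axis=0)
    # Each column becomes a distribution over samples; constant columns become uniform.
    p = np.where(col_sum > 0, scaled / np.where(col_sum > 0, col_sum, 1.0), 1.0 / n)
    # Shannon entropy normalized by log(n), with 0 * log(0) treated as 0.
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    e = -plogp.sum(axis=0) / np.log(n)
    d = np.clip(1.0 - e, 0.0, None)  # divergence: low entropy means high weight
    if d.sum() == 0:
        return np.full(X.shape[1], 1.0 / X.shape[1])
    return d / d.sum()


def fuse_scores(X):
    """Fuse similarity indexes into one score: first principal component
    of the entropy-weighted, standardized index matrix (assumed design)."""
    w = entropy_weights(X)
    std = X.std(axis=0)
    Z = (X - X.mean(axis=0)) / np.where(std > 0, std, 1.0)
    Zw = Z * w  # still column-centered, so SVD gives the PCA directions
    _, _, vt = np.linalg.svd(Zw, full_matrices=False)
    pc1 = vt[0]
    if pc1.sum() < 0:  # orient so larger index values give larger fused scores
        pc1 = -pc1
    return Zw @ pc1


def auc(pos_scores, neg_scores):
    """Probability that a true link outscores a non-link (ties count 0.5)."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    return float(((pos > neg) + 0.5 * (pos == neg)).mean())


if __name__ == "__main__":
    # Synthetic stand-in for seven similarity indexes on 200 node pairs.
    rng = np.random.default_rng(0)
    is_link = np.repeat([1, 0], 100)
    X = is_link[:, None] * 1.0 + rng.normal(size=(200, 7))
    s = fuse_scores(X)
    print("AUC of fused score:", auc(s[is_link == 1], s[is_link == 0]))
```

In practice each column of `X` would hold one similarity index computed for candidate node pairs, and the AUC would compare held-out true links against nonexistent links.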

CLC number: