Journal of Computer Applications ›› 2014, Vol. 34 ›› Issue (8): 2179-2183. DOI: 10.11772/j.issn.1001-9081.2014.08.2179

• Papers of the 5th China Conference on Data Mining (CCDM 2014) •

Link prediction algorithm based on link importance and data field

CHEN Qiaoyu, BAN Zhijie

  1. School of Computer Science, Inner Mongolia University, Hohhot, Inner Mongolia 010021, China
  • Received: 2014-05-04 Revised: 2014-05-13 Online: 2014-08-01 Published: 2014-08-10
  • Corresponding author: BAN Zhijie
  • About the authors: CHEN Qiaoyu (born 1986), female, from Tangshan, Hebei, master's student; main research interests: data mining and complex network analysis. BAN Zhijie (born 1976), female, from Chifeng, Inner Mongolia, associate professor; main research interests: data mining and online network analysis.
  • Supported by:

    National Natural Science Foundation of China; Scientific Research Project of Higher Education Institutions of Inner Mongolia Autonomous Region; Natural Science Foundation of Inner Mongolia Autonomous Region

Abstract:

Existing link prediction methods based on node similarity ignore the link-strength information contained in the network topology itself, and in weighted topological-path methods the weights are difficult to determine. To address these shortcomings, a link prediction algorithm based on link importance and data field is proposed. First, every link is assigned its own weight according to the topology graph. Second, the mutual influence between potentially linked nodes is taken into account, and link values are pre-estimated for some of the node pairs that have no link. Finally, the similarity between two nodes is computed with the data field potential function. Experiments on typical real-world network datasets show that the proposed method performs well on both classification and recommendation metrics. Measured by Area Under Curve (AUC), it improves on the Local Path (LP) algorithm, which has the same complexity, by 3 to 6 percentage points; measured by Discounted Cumulative Gain (DCG), it improves on LP by 1.5 to 2.5 DCG points. Overall, the algorithm improves prediction accuracy, and because its parameters are simple to determine and its complexity is low, it is easy to deploy in practice.
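For readers unfamiliar with the baseline and the classification metric named in the abstract, the following is a minimal sketch using their commonly cited definitions: the LP index S = A^2 + εA^3 over the adjacency matrix A, and AUC as the probability that a held-out link scores higher than a nonexistent link, with ties counted as 0.5. It is not the authors' implementation and does not cover the proposed link-importance or data-field steps, whose details the abstract does not give. The function names, the toy graph, and the value ε = 0.01 are assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def local_path_scores(adj, epsilon=0.01):
    """LP index in its commonly cited form: S = A^2 + epsilon * A^3."""
    a2 = matmul(adj, adj)   # counts paths of length 2 (common neighbours)
    a3 = matmul(a2, adj)    # counts walks of length 3
    n = len(adj)
    return [[a2[i][j] + epsilon * a3[i][j] for j in range(n)]
            for i in range(n)]


def auc(scores, held_out, non_edges):
    """Exhaustive AUC: compare every held-out link with every non-link."""
    total = 0.0
    for i, j in held_out:
        for u, v in non_edges:
            if scores[i][j] > scores[u][v]:
                total += 1.0
            elif scores[i][j] == scores[u][v]:
                total += 0.5   # ties count as half a win
    return total / (len(held_out) * len(non_edges))


# Hypothetical toy network with 5 nodes. The training graph has edges
# 0-1, 0-2, 1-2, 2-3, 3-4; the edge 1-3 is held out as the test link.
adj = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
held_out = [(1, 3)]
non_edges = [(0, 3), (0, 4), (1, 4), (2, 4)]

scores = local_path_scores(adj, epsilon=0.01)
toy_auc = auc(scores, held_out, non_edges)
print(toy_auc)
```

On large networks the AUC is usually estimated by sampling pairs rather than enumerating them, and a sparse matrix library would replace the pure-Python multiplication used here for self-containment.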

CLC Number: