计算机应用 ›› 2018, Vol. 38 ›› Issue (2): 464-470.DOI: 10.11772/j.issn.1001-9081.2017071820

• 数据科学与技术 • 上一篇    下一篇

基于k-核过滤的社交网络影响最大化算法

李阅志1,2, 祝园园1,2, 钟鸣1,2   

  1. 1. 软件工程国家重点实验室(武汉大学), 武汉 430072;
    2. 武汉大学 计算机学院, 武汉 430072
  • 收稿日期:2017-07-25 修回日期:2017-09-10 出版日期:2018-02-10 发布日期:2018-02-10
  • 通讯作者: 祝园园
  • 作者简介:李阅志(1993-),男,湖北黄冈人,硕士研究生,主要研究方向:社交网络、数据挖掘;祝园园(1984-),女,河南周口人,副教授,博士,CCF会员,主要研究方向:图数据库、大规模数据分析、数据挖掘;钟鸣(1982-),男,湖北孝感人,副教授,博士,CCF会员,主要研究方向:大规模数据分析、图查询与处理、数据挖掘。
  • 基金资助:
    国家自然科学基金资助项目(61502349);湖北省自然科学基金资助项目(2015CFB339);苏州市科技发展项目(SYG201442)。

k-core filtered influence maximization algorithms in social networks

LI Yuezhi1,2, ZHU Yuanyuan1,2, ZHONG Ming1,2   

  1. 1. State Key Laboratory of Software Engineering(Wuhan University), Wuhan Hubei 430072, China;
    2. School of Computer, Wuhan University, Wuhan Hubei 430072, China
  • Received:2017-07-25 Revised:2017-09-10 Online:2018-02-10 Published:2018-02-10
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61502349), the Hubei Provincial Natural Science Foundation (2015CFB339), the Scientific and Technologic Development Programm of Suzhou (SYG201442).

摘要: 针对现有社交网络影响最大化算法影响范围小和时间复杂度高的问题,提出一种基于独立级联模型的k-核过滤算法。首先,介绍了一种节点影响力排名不依赖于整个网络的现有影响力最大化算法;然后,通过预训练k,找到对现有算法具有最佳优化效果且与选择种子数无关的k值;最后,通过计算图的k-核过滤不属于k-核子图的节点和边,在k-核子图上执行现有影响最大化算法,达到降低计算复杂度的目的。为验证k-核过滤算法对不同算法有不同的优化效果,在不同规模数据集上进行了实验。结果显示,应用k-核过滤算法后:与原PMIA算法相比,影响范围最多扩大13.89%,执行时间最多缩短8.34%;与原核覆盖算法(CCA)相比,影响范围没有太大差异,但执行时间最多缩短28.5%;与OutDegree算法相比,影响范围最多扩大21.81%,执行时间最多缩短26.96%;与Random算法相比,影响范围最多扩大71.99%,执行时间最多缩短24.21%。进一步提出了一种新的影响最大化算法GIMS,它比PMIA和IRIE的影响范围更大,执行时间保持在秒级别,而且GIMS算法的k-核过滤算法与原GIMS算法的影响范围和执行时间差异不大。实验结果表明,k-核过滤算法能够增大现有算法选择种子节点集合的影响范围,并且减少执行时间;GIMS算法具有更好的影响范围效果和执行效率,并且更加鲁棒。

关键词: 社交网络, 影响最大化, k-核, 独立级联模型, 传播树

Abstract: Concerning the limited influence scope and high time complexity of existing influence maximization algorithms in social networks, a k-core filtered algorithm based on independent cascade model was proposed. Firstly, an existing influence maximization algorithm was introduced, its rank of nodes does not depend on the entire network. Secondly, pre-training was carried out to find the value of k which has the best optimization effect on existing algorithms but has no relation with the number of selected seeds. Finally, the nodes and edges that do not belong to the k-core subgraph were filtered by computing the k-core of the graph, then the existing influence maximization algorithms were applied on the k-core subgraph, thus reducing computational complexity. Several experiments were conducted on datasets with different scale to prove that the k-core filtered algorithm has different optimization effects on different influence maximization algorithms. After combined with k-core filtered algorithm, compared with the original Prefix excluding Maximum Influence Arborescence (PMIA) algorithm, the influence range is increased by 13.89% and the execution time is reduced by as much as 8.34%; compared with the original Core Covering Algorithm (CCA), the influence range has no obvious difference and the execution time is reduced by as much as 28.5%; compared with the original OutDegree algorithm, the influence range is increased by 21.81% and the execution time is reduced by as much as 26.96%; compared with the original Random algorithm, the influence range is increased by 71.99% and the execution time is reduced by as much as 24.21%. Furthermore, a new influence maximization algorithm named GIMS (General Influence Maximization in Social network) was proposed. Compared with PIMA and Influence Rank Influence Estimation (IRIE), it has wider influence range while still keeping execution time at second level. When it was combined with k-core filtered algorithm, the influence range and execution time do not have significant change. The experimiental results show that k-core filtered algorithm can effectively increase the influence ranges of existing algorithms and reduce their execution times; in addition, the proposed GIMS algorithm has wider influence range and better efficiency, and it is more robust.

Key words: social network, Influence Maximization (IM), k-core, independent cascade model, diffusion tree

中图分类号: