Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (12): 3487-3490. DOI: 10.11772/j.issn.1001-9081.2015.12.3487

• Artificial Intelligence •

User influence algorithm based on user content and relational structure

MA Huifang1,2, SHI Yakai1, XIE Meng1, ZHUANG Fuzhen2

  1. College of Computer Science and Engineering, Northwest Normal University, Lanzhou Gansu 730070, China;
  2. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2015-05-06 Revised: 2015-08-06 Online: 2015-12-10 Published: 2015-12-10
  • Corresponding author: MA Huifang (born 1981), female, from Lanzhou, Gansu; associate professor, Ph.D.; research interests: data mining, machine learning.
  • About the authors: SHI Yakai (born 1988), male, from Pingdingshan, Henan; M.S., CCF member; research interests: Internet data mining. XIE Meng (born 1990), male, from Xingtai, Hebei; M.S. candidate; research interests: Internet data mining. ZHUANG Fuzhen (born 1983), male, from Longyan, Fujian; associate research fellow, Ph.D., CCF member; research interests: transfer learning, data mining, machine learning.
  • Supported by:
    National Natural Science Foundation of China (61163039, 61363058); Gansu Province Youth Science and Technology Fund (145RJYA259); Natural Science Research Fund of Gansu Province (145RJZA232); Gansu Provincial Department of Education projects (2013B-007, 2013A-016); Open Fund of the Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences (IIP2014-4).

Abstract: To rapidly detect the paths along which information spreads and to reduce the impact of malicious information, an iterative user influence algorithm that combines user content with relational structure, the Content and Structure-based Influence Algorithm with Iteration (CSIAI), was proposed. The algorithm models each user's microblog content and iteratively computes the word-user document similarity. It also builds the user relational structure from the following and follower relationships on the microblog platform, calculates user influence weights, and obtains the adjacency matrix of user influence. The k nodes with higher influence are then extracted as the information propagation path. In simulated detection experiments, influence coverage rate and response time were used as evaluation metrics, and the relationship between the CSIAI parameters α and β was determined on the basis of the extended new knowledge base. As the number of users grows, CSIAI clearly outperforms PageRank, CELF, and the non-iterative Content and Structure-based Influence Algorithm (CSIA) in both influence coverage rate and response time. The experimental results show that CSIAI can effectively detect how microblog information spreads.

Key words: microblog content, user relationship, influence, information dissemination, iterative similarity computation
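
The abstract gives no formulas, so the following Python sketch only illustrates the general idea of blending a content score with a structure score into an influence adjacency matrix and then picking k high-influence users. Everything specific here is an assumption and not the authors' CSIAI: plain cosine similarity stands in for the iterative word-user document similarity, the structure term `1 / (number of accounts followed)` is invented for illustration, and the names `influence_matrix`, `top_k`, `alpha`, and `beta` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def influence_matrix(posts, follows, alpha=0.5, beta=0.5):
    """Build weights[u][v], an assumed influence of user u on follower v.

    posts:   user -> concatenated microblog text
    follows: follower -> set of users that follower follows
    alpha weights the content term, beta the structure term (assumption).
    """
    bags = {u: Counter(text.lower().split()) for u, text in posts.items()}
    weights = {u: {} for u in posts}
    for follower, followees in follows.items():
        for followee in followees:
            content = cosine(bags[followee], bags[follower])
            # Assumed structure term: a follower's attention is split
            # evenly across everyone they follow.
            structure = 1.0 / len(followees)
            weights[followee][follower] = alpha * content + beta * structure
    return weights


def top_k(weights, k):
    """Return the k users with the largest total outgoing influence."""
    totals = {u: sum(out.values()) for u, out in weights.items()}
    return sorted(totals, key=lambda u: (-totals[u], u))[:k]


# Toy data (hypothetical): b and c follow a; d follows a and b.
posts = {"a": "ai news today", "b": "ai news", "c": "cooking tips", "d": "ai tips"}
follows = {"b": {"a"}, "c": {"a"}, "d": {"a", "b"}}

weights = influence_matrix(posts, follows)
seeds = top_k(weights, 2)
```

In this toy graph, user `a` has the most followers and shares vocabulary with two of them, so it ranks first. How α and β should actually be related is something the paper determines experimentally; the equal defaults here are placeholders.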
