Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (1): 213-219. DOI: 10.11772/j.issn.1001-9081.2018061321

• Data Science and Technology •

User influence analysis algorithm for Weibo topics

LIU Wei1, ZHANG Mingxin2, AN Dezhi3

  1. School of Computer Science and Technology, Soochow University, Suzhou Jiangsu 215006, China;
    2. School of Computer Science and Engineering, Changshu Institute of Technology, Changshu Jiangsu 215500, China;
    3. School of Cyber Security, Gansu Institute of Political Science and Law, Lanzhou Gansu 730070, China
  • Received:2018-06-25 Revised:2018-08-16 Online:2019-01-10 Published:2019-01-21
  • Corresponding author: ZHANG Mingxin
  • About the authors: LIU Wei (born 1993), male, from Fuyang, Anhui, M.S. candidate; main research interest: data mining. ZHANG Mingxin (born 1962), male, from Xinzhou, Shanxi, professor, Ph.D., CCF member; main research interests: cloud computing, data mining, intelligent control. AN Dezhi (born 1973), male, from Lanzhou, Gansu, professor; main research interests: cyberspace security, data mining.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61363024).

Abstract: User influence analysis on Weibo is an important part of social network analysis and has long attracted researchers' attention. Existing work analyzes the timing of user behavior inadequately and ignores the relevance between users and the topics they participate in. To address these problems, a user influence analysis algorithm for Weibo topics, named Topic and Spread user Rank (TSRank), was proposed. Firstly, the timeliness of users' forwarding behavior was analyzed on the basis of Weibo topics, and two topic forwarding networks, a user forwarding network and a user-post forwarding network, were constructed to predict each user's topic information dissemination capability. Secondly, the text of each user's historical posts and of the background topic's posts was analyzed to mine the relevance between the user and the background topic. Finally, user influence was calculated by combining the user's topic information dissemination capability with the relevance between the user and the background topic. Experiments were conducted on real topic data crawled from Sina Weibo. The results show that users with higher topic relevance receive significantly more topic forwards than users with very low relevance. Introducing forwarding timeliness increased the Catch Ratio (CR) of TSRank by 18.7% compared with the variant without it. Compared with the typical influence analysis algorithms WBRank, TwitterRank and PageRank, TSRank improved precision by 5.9%, 8.7% and 13.1% and recall by 6.7%, 9.1% and 14.2%, respectively, which verifies its effectiveness. The results can support theoretical research on the social attributes of social networks and on topic propagation, as well as applied research such as friend recommendation and public opinion monitoring.

Key words: social network, user influence, forward relationship, Weibo topic, information dissemination capability
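The abstract does not give TSRank's formulas, so the following is only a minimal sketch of the general idea it describes: a spread score derived from a time-weighted forwarding graph, blended with a text-based relevance score. Everything specific here is an assumption made for illustration and not the authors' method: the exponential decay in `time_weight`, the weighted PageRank in `spread_scores`, the bag-of-words cosine in `cosine_relevance`, the linear blend with `alpha`, and all function and parameter names.

```python
import math
from collections import Counter, defaultdict


def time_weight(delay_hours, half_life_hours=24.0):
    """Hypothetical exponential decay: a faster forward counts for more."""
    return 0.5 ** (delay_hours / half_life_hours)


def spread_scores(forwards, damping=0.85, iterations=50):
    """Weighted PageRank over a forwarding graph (illustrative stand-in).

    forwards: iterable of (forwarder, author, delay_hours). Each forward
    adds an edge forwarder -> author, so rank flows to forwarded users.
    """
    out_edges = defaultdict(lambda: defaultdict(float))
    users = set()
    for forwarder, author, delay in forwards:
        out_edges[forwarder][author] += time_weight(delay)
        users.update((forwarder, author))
    if not users:
        return {}
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Rank held by users with no outgoing edges is spread uniformly.
        dangling = sum(rank[u] for u in users if u not in out_edges)
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in users}
        for u, targets in out_edges.items():
            total = sum(targets.values())
            for v, w in targets.items():
                new_rank[v] += damping * rank[u] * w / total
        rank = new_rank
    return rank


def cosine_relevance(user_tokens, topic_tokens):
    """Cosine similarity between bag-of-words counts of two token lists."""
    a, b = Counter(user_tokens), Counter(topic_tokens)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


def influence(forwards, user_tokens, topic_tokens, alpha=0.5):
    """Hypothetical linear blend of normalized spread score and relevance."""
    spread = spread_scores(forwards)
    top = max(spread.values(), default=0.0) or 1.0
    return {
        u: alpha * spread[u] / top
        + (1.0 - alpha) * cosine_relevance(user_tokens.get(u, []), topic_tokens)
        for u in spread
    }


# Toy data (invented for illustration): (forwarder, author, delay in hours).
forwards = [("b", "a", 1.0), ("c", "a", 2.0), ("c", "b", 30.0)]
tokens = {"a": ["weibo", "topic"], "b": ["sports"], "c": ["topic"]}
scores = influence(forwards, tokens, ["weibo", "topic", "rank"])
print(sorted(scores, key=scores.get, reverse=True))
```

In this sketch the half-life and `alpha` are free parameters; the paper's actual weighting of forwarding time and topic relevance may differ substantially.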

CLC Number: