Rumor detection model based on user propagation network and message content

Journal of Computer Applications, 2021, Vol. 41, Issue 12: 3540-3545. DOI: 10.11772/j.issn.1001-9081.2021060963

Section: The 18th China Conference on Machine Learning (CCML 2021)

Haitao XUE¹, Li WANG¹, Yanjie YANG¹, Biao LIAN²

1. College of Data Science, Taiyuan University of Technology, Taiyuan, Shanxi 030600, China
2. North Automatic Control Technology Institute, Taiyuan, Shanxi 030006, China

Received: 2021-05-12 Revised: 2021-06-25 Accepted: 2021-07-04 Online: 2021-12-28 Published: 2021-12-10
Contact: Li WANG
About the authors:
XUE Haitao, born in 1997 in Jiexiu, Shanxi, M.S. candidate. His research interests include natural language processing and rumor detection.
YANG Yanjie, born in 1995 in Yuanping, Shanxi, M.S. candidate. His research interests include natural language processing and data mining.
LIAN Biao, born in 1987 in Taiyuan, Shanxi, M.S. His research interests include software development and data mining.
Supported by: the National Natural Science Foundation of China (61872260)

Abstract:

Message content on social media platforms is generally very short, propagation structures contain many empty forwards, and user roles are often mismatched with content. To address these constraints, a rumor detection model named GMB_GMU was proposed, based on the user attribute information in the propagation network and on the message content. First, a user propagation network was constructed with user attributes as nodes and propagation chains as edges, and a Graph Attention Network (GAT) was introduced to obtain an enhanced representation of user attributes. Meanwhile, based on the same user propagation network, structural representations of users were obtained with node2vec and enhanced with a mutual attention mechanism. In addition, BERT (Bidirectional Encoder Representations from Transformers) was introduced to build the representation of the source post content. Finally, a Gated Multimodal Unit (GMU) was used to fuse the user attribute representation, the structural representation, and the source post content representation into the final message representation. Experimental results show that GMB_GMU achieves an accuracy of 0.952 on the public Weibo dataset and can effectively identify rumor events, significantly outperforming propagation algorithms based on Recurrent Neural Networks (RNN) and other neural network baseline models.
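The data flow summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy graph, the random weights, the mean pooling of node outputs, and the stand-in vectors for the node2vec and BERT embeddings are all assumptions made for illustration, and the mutual attention step is omitted. The attention layer follows the standard single-head GAT formulation, and the fusion uses one common multi-modality GMU form (a tanh projection per modality, weighted by a sigmoid gate computed from all modalities); the abstract does not specify which variant the paper uses.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def leaky_relu(v, slope=0.2):
    return v if v > 0 else slope * v

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gat_layer(features, neighbors, W, a):
    """Single-head graph attention: each node aggregates its neighborhood.

    features:  list of node feature vectors
    neighbors: neighbors[i] lists the nodes that node i attends to (including i)
    W:         shared linear projection; a: attention vector of length 2 * out_dim
    """
    projected = [matvec(W, h) for h in features]
    out = []
    for i, nbrs in enumerate(neighbors):
        # e_ij = LeakyReLU(a^T [W h_i || W h_j]) for each neighbor j
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, projected[i] + projected[j])))
            for j in nbrs
        ]
        alpha = softmax(scores)  # attention weights over the neighborhood
        dim = len(projected[i])
        out.append([
            sum(alpha[k] * projected[j][d] for k, j in enumerate(nbrs))
            for d in range(dim)
        ])
    return out

def gmu_fuse(inputs, proj, gates):
    """Gated fusion of several modality vectors into one hidden vector.

    inputs: list of modality vectors x_i (sizes may differ)
    proj:   proj[i] maps x_i to the shared hidden size
    gates:  gates[i] maps the concatenation of all x_i to the hidden size
    """
    concat = [v for x in inputs for v in x]
    fused = [0.0] * len(proj[0])
    for x, W_h, W_z in zip(inputs, proj, gates):
        h = [math.tanh(v) for v in matvec(W_h, x)]     # modality-specific hidden state
        z = [sigmoid(v) for v in matvec(W_z, concat)]  # per-dimension gate
        fused = [f + zi * hi for f, zi, hi in zip(fused, z, h)]
    return fused

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)

# Toy propagation network (hypothetical): user 0 posts, users 1 and 2 repost from user 0.
user_attrs = [[1.0, 0.2, 0.0], [0.1, 0.9, 1.0], [0.3, 0.3, 0.5]]
neighbors = [[0, 1, 2], [1, 0], [2, 0]]
W = random_matrix(4, 3, rng)
a = [rng.uniform(-0.5, 0.5) for _ in range(8)]
attr_nodes = gat_layer(user_attrs, neighbors, W, a)

# Mean-pool node outputs into one attribute vector (the pooling choice is an assumption).
attr_repr = [sum(col) / len(col) for col in zip(*attr_nodes)]
struct_repr = [0.4, -0.1, 0.7, 0.2, 0.0]       # stand-in for a node2vec embedding
text_repr = [0.5, 0.1, -0.3, 0.8, 0.2, -0.6]   # stand-in for a BERT embedding

inputs = [attr_repr, struct_repr, text_repr]
total_dim = sum(len(x) for x in inputs)
proj = [random_matrix(4, len(x), rng) for x in inputs]
gates = [random_matrix(4, total_dim, rng) for _ in inputs]
message_repr = gmu_fuse(inputs, proj, gates)
print(message_repr)
```

In a trained model the fused vector would feed a classifier that labels the event as rumor or non-rumor. The gates let the model weight each modality per dimension, which is useful when one input, such as the text of a very short post, carries little signal.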

Keywords: rumor detection, user attribute, Graph Attention Network (GAT), Gated Multimodal Unit (GMU), propagation network

CLC number: TP391

Cite this article: Haitao XUE, Li WANG, Yanjie YANG, Biao LIAN. Rumor detection model based on user propagation network and message content[J]. Journal of Computer Applications, 2021, 41(12): 3540-3545.

Figures/Tables (10)


Statistic	Count
Number of events	4 664
Number of true events	2 351
Number of false events	2 313
Number of users	2 746 818
Number of posts	3 805 656

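Feature	Description
reposts_count	Number of reposts of the post
bi_followers_count	Number of mutual followers
friends_count	Number of accounts followed
followers_count	Number of followers
statuses_count	Number of posts published
verified	Whether the account is verified
favourits_count	Number of favorited posts
comments_count	Number of comments
t	Timestamp of the user's repost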
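As a rough illustration of how the features in this table could become numeric node inputs for the user propagation network, the sketch below maps one repost record to a vector. The record layout, the helper name `user_feature_vector`, the log scaling of counts, and the treatment of `t` as a Unix timestamp in seconds are all assumptions; the page does not describe the paper's actual preprocessing.

```python
import math

# Count-valued fields, spelled exactly as in the feature table.
COUNT_FIELDS = [
    "reposts_count", "bi_followers_count", "friends_count",
    "followers_count", "statuses_count", "favourits_count", "comments_count",
]

def user_feature_vector(record, source_time):
    """Turn one repost record (a dict) into a numeric node-feature vector."""
    # log1p compresses heavy-tailed counts (an assumed preprocessing step).
    counts = [math.log1p(max(0, record.get(name, 0))) for name in COUNT_FIELDS]
    verified = 1.0 if record.get("verified") else 0.0
    # Delay in hours between the source post and this repost, clipped at zero.
    delay_hours = max(0.0, (record["t"] - source_time) / 3600.0)
    return counts + [verified, math.log1p(delay_hours)]

# Hypothetical record, for illustration only.
example = {
    "reposts_count": 12, "bi_followers_count": 85, "friends_count": 300,
    "followers_count": 15000, "statuses_count": 2400, "verified": True,
    "favourits_count": 40, "comments_count": 3, "t": 1_400_003_600,
}
vector = user_feature_vector(example, source_time=1_400_000_000)
print(len(vector), vector)
```

Each such vector would be attached to a user node, with edges following the repost chain.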

Method	Accuracy	Precision (rumor)	Precision (non-rumor)	Recall (rumor)	Recall (non-rumor)	F1 (rumor)	F1 (non-rumor)
DTC	0.831	0.815	0.847	0.825	0.815	0.819	0.831
SVM-RBF	0.879	0.579	0.777	0.708	0.656	0.615	0.708
TD-RvNN	0.908	0.904	0.912	0.918	0.897	0.911	0.905
PPC_RNN+CNN	0.913	0.927	0.884	0.901	0.932	0.907	0.922
HiMap-HO+Text	0.892	0.892	0.892	0.878	0.910	0.896	0.888
BiGCN	0.905	0.919	0.894	0.895	0.916	0.900	0.898
GMB_GMU	0.952	0.934	0.968	0.967	0.939	0.950	0.953
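The F1 columns are the harmonic mean of precision and recall, F1 = 2PR / (P + R). As a quick consistency check, the sketch below recomputes the two F1 values of the GMB_GMU row from its reported precision and recall. The helper name `f1_score` is illustrative and is not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall of the GMB_GMU row in the table above.
rumor_f1 = f1_score(0.934, 0.967)
non_rumor_f1 = f1_score(0.968, 0.939)
print(f"{rumor_f1:.3f} {non_rumor_f1:.3f}")  # prints "0.950 0.953"
```

Both values match the F1 entries reported for GMB_GMU to three decimal places.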
