Microblog rumor detection model based on heterogeneous graph attention network

Journal of Computer Applications, 2021, Vol. 41, Issue (12): 3546-3550. DOI: 10.11772/j.issn.1001-9081.2021060981

The 18th China Conference on Machine Learning (CCML 2021)


Bei BI¹,², Huiyao PAN¹, Feng CHEN¹, Jingyan SUI⁴, Yang GAO³, Yaojun WANG¹

1. College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
2. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
3. College of Economics and Management, Beijing University of Technology, Beijing 100124, China
4. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

Received: 2021-05-12  Revised: 2021-07-05  Accepted: 2021-07-05  Online: 2021-12-28  Published: 2021-12-10
Contact: Yaojun WANG
About the authors:
BI Bei, born in 2000, female, a native of Heze, Shandong, M.S. candidate. Her research interests include graph neural networks and federated learning.
PAN Huiyao, born in 1999, female, a native of Jingzhou, Hubei, M.S. candidate. Her research interests include text mining and knowledge graphs.
CHEN Feng, born in 1998, male, a native of Changxing, Zhejiang, M.S. candidate. His research interests include virtual reality, affective computing and intelligent design.
SUI Jingyan, born in 1982, female, a native of Yantai, Shandong, Ph.D. candidate. Her research interests include deep reinforcement learning, combinatorial optimization, and algorithm design and analysis.
GAO Yang, born in 1988, female, a native of Yantai, Shandong, Ph.D., associate professor. Her research interests include financial time series analysis.
Supported by:
the Youth Program of Beijing Natural Science Foundation (5214026); the 2115 Talent Development Program of China Agricultural University

Abstract

Social media greatly facilitates daily communication and information dissemination, but it is also a breeding ground for rumors. Automatically detecting rumors at an early stage of their spread is therefore of great practical significance, yet existing detection methods do not make full use of the semantic information in the microblog propagation graph. To address this problem, a rumor monitoring model named MicroBlog-HAN was built on the Heterogeneous graph Attention Network (HAN). The model adopts a hierarchical attention mechanism consisting of node-level attention and semantic-level attention. First, node-level attention aggregates the neighbors of each microblog node to generate two groups of node embeddings, each carrying a specific semantics. Then, semantic-level attention fuses the different semantics into the final node embeddings, which are fed to a classifier for binary classification. Finally, the model outputs whether the input microblog is a rumor or a non-rumor. Experimental results on two real-world microblog rumor datasets show that MicroBlog-HAN identifies microblog rumors with an accuracy above 87%.

Key words: microblog, rumor detection, heterogeneous graph, meta-path, Heterogeneous graph Attention Network (HAN)
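The two attention levels described in the abstract can be illustrated with a small sketch. It follows the general HAN formulation rather than the authors' implementation: the toy features `h`, the neighbor dictionaries, the attention vector `a` and the semantic query `q` are all made-up values, and the learned projection layers and output activation of a real HAN are left out for brevity.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def node_level_attention(h, neighbors, a):
    """Aggregate each node's neighbors under ONE meta-path.

    h: list of feature vectors; neighbors: dict node -> list of neighbor ids
    (the node itself included); a: attention vector of length 2 * dim.
    """
    z = []
    for i in range(len(h)):
        nbrs = neighbors[i]
        # Score each neighbor from the concatenation [h_i || h_j].
        scores = [leaky_relu(dot(a, h[i] + h[j])) for j in nbrs]
        alpha = softmax(scores)
        dim = len(h[i])
        z.append([sum(w * h[j][d] for w, j in zip(alpha, nbrs)) for d in range(dim)])
    return z

def semantic_level_attention(z_per_path, q):
    """Fuse the per-meta-path embeddings with one weight per meta-path."""
    # Importance of a meta-path: mean over nodes of q . tanh(z_i).
    # (Assumption: the learned linear layer before tanh is omitted.)
    w = [sum(dot(q, [math.tanh(x) for x in z_i]) for z_i in z) / len(z)
         for z in z_per_path]
    beta = softmax(w)
    n, dim = len(z_per_path[0]), len(z_per_path[0][0])
    fused = [[sum(b * z[i][d] for b, z in zip(beta, z_per_path)) for d in range(dim)]
             for i in range(n)]
    return fused, beta

# Toy example: 3 microblog nodes, 2 hypothetical meta-path neighborhoods.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors_path1 = {0: [0, 1], 1: [1, 0], 2: [2]}
neighbors_path2 = {0: [0, 2], 1: [1], 2: [2, 0]}
a = [0.5, -0.5, 0.25, 0.25]
q = [1.0, 1.0]

z1 = node_level_attention(h, neighbors_path1, a)
z2 = node_level_attention(h, neighbors_path2, a)
fused, beta = semantic_level_attention([z1, z2], q)
print("meta-path weights:", beta)
print("fused embeddings:", fused)
```

In a trained model the fused embeddings would be passed to the binary classifier; here they are only printed.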

CLC number: TP391

Bei BI, Huiyao PAN, Feng CHEN, Jingyan SUI, Yang GAO, Yaojun WANG. Microblog rumor detection model based on heterogeneous graph attention network[J]. Journal of Computer Applications, 2021, 41(12): 3546-3550.


Dataset	Total microblogs	Non-rumors	Rumors	Users	Reposts/comments
Weibo2016	4,664	2,351	2,313	2,746,818	3,805,656
Weibo2021	1,018	519	499	185,653	233,466
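A quick arithmetic check of the dataset table shows that both datasets are nearly balanced, which matters when reading accuracy figures. The snippet below is only an illustration built from the counts in the table; the dictionary layout and function names are assumptions made for this example.

```python
# Counts copied from the dataset table above.
datasets = {
    "Weibo2016": {"total": 4664, "non_rumor": 2351, "rumor": 2313},
    "Weibo2021": {"total": 1018, "non_rumor": 519, "rumor": 499},
}

def rumor_share(d):
    """Fraction of microblogs labelled as rumors."""
    return d["rumor"] / d["total"]

def majority_baseline_accuracy(d):
    """Accuracy of always predicting the larger class."""
    return max(d["rumor"], d["non_rumor"]) / d["total"]

for name, d in datasets.items():
    # The two classes should add up to the reported total.
    assert d["rumor"] + d["non_rumor"] == d["total"]
    print(f"{name}: rumor share = {rumor_share(d):.3f}, "
          f"majority baseline = {majority_baseline_accuracy(d):.3f}")
```

Since the rumor share is close to one half in both datasets, always predicting the larger class would reach only about 50% accuracy.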


Model	Class	Accuracy	Precision	Recall	F1
DTR	Rumor	0.732	0.738	0.715	0.726
DTR	Non-rumor	0.732	0.726	0.749	0.737
DTC	Rumor	0.831	0.847	0.815	0.831
DTC	Non-rumor	0.831	0.815	0.847	0.830
RFC	Rumor	0.849	0.786	0.959	0.864
RFC	Non-rumor	0.849	0.947	0.739	0.830
SVM-RBF	Rumor	0.818	0.822	0.812	0.817
SVM-RBF	Non-rumor	0.818	0.815	0.824	0.819
SVM-TS	Rumor	0.857	0.839	0.885	0.861
SVM-TS	Non-rumor	0.857	0.878	0.830	0.857
GRU	Rumor	0.910	0.876	0.956	0.914
GRU	Non-rumor	0.910	0.952	0.864	0.906
MHAN	Rumor	0.912	0.894	0.899	0.910
MHAN	Non-rumor	0.912	0.930	0.926	0.914
MHAN_WUW	Rumor	0.895	0.934	0.848	0.889
MHAN_WUW	Non-rumor	0.895	0.863	0.941	0.900
MHAN_WPUPW	Rumor	0.907	0.898	0.916	0.907
MHAN_WPUPW	Non-rumor	0.907	0.915	0.900	0.906
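The table reports one accuracy per model but separate precision, recall and F1 values for each class. The sketch below shows how such per-class rows are usually derived from a single confusion matrix. The counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 with one class treated as positive."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical confusion counts with "rumor" as the positive class.
tp, fp, fn, tn = 90, 10, 20, 80

rumor_row = binary_metrics(tp, fp, fn, tn)
# For the "non-rumor" row the roles of the two classes are swapped.
non_rumor_row = binary_metrics(tp=tn, fp=fn, fn=fp, tn=tp)

for label, row in (("Rumor", rumor_row), ("Non-rumor", non_rumor_row)):
    print(label, {k: round(v, 3) for k, v in row.items()})
```

Accuracy is unchanged when the positive class is swapped, which is why the same accuracy appears in both rows of each model.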


Model	Class	Accuracy	Precision	Recall	F1
MHAN	Rumor	0.871	0.884	0.868	0.876
MHAN	Non-rumor	0.871	0.857	0.874	0.866
MHAN_WUW	Rumor	0.777	0.741	0.824	0.780
MHAN_WUW	Non-rumor	0.777	0.818	0.733	0.773
MHAN_WPUPW	Rumor	0.849	0.852	0.830	0.841
MHAN_WPUPW	Non-rumor	0.849	0.846	0.867	0.856
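The variant names MHAN_WUW and MHAN_WPUPW appear to refer to single meta-paths. As a rough illustration of how meta-path neighbors can be derived, the sketch below assumes that W stands for a microblog post and U for a user, so that two posts are W-U-W neighbors when they share at least one user. This reading, the toy data and the function name are all assumptions. The meaning of P is not given on this page, so the longer meta-path is not modelled.

```python
def wuw_neighbors(post_to_users):
    """Return, for each post, the set of posts reachable via post-user-post.

    post_to_users: dict mapping a post id to the set of user ids linked to it.
    Each post is included in its own neighbor set.
    """
    # Invert the mapping: which posts is each user linked to?
    user_to_posts = {}
    for post, users in post_to_users.items():
        for user in users:
            user_to_posts.setdefault(user, set()).add(post)

    neighbors = {post: {post} for post in post_to_users}
    for posts in user_to_posts.values():
        # All posts linked to the same user are mutual neighbors.
        for post in posts:
            neighbors[post] |= posts
    return neighbors

# Toy data: w1 and w2 share user u2; w3 shares no user with the others.
toy = {"w1": {"u1", "u2"}, "w2": {"u2"}, "w3": {"u3"}}
result = wuw_neighbors(toy)
for post in sorted(result):
    print(post, sorted(result[post]))
```

Neighbor sets of this kind are what a node-level attention layer would aggregate over for one meta-path.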

