Rumor detection method based on cross-modal attention mechanism and contrastive learning

doi:10.11772/j.issn.1001-9081.2025030266

Journal of Computer Applications, 2026, Vol. 46, Issue (2): 361-367. DOI: 10.11772/j.issn.1001-9081.2025030266

• Artificial Intelligence •

Hu LUO, Mingshu ZHANG

School of Cryptographic Engineering, Engineering University of PAP, Xi'an, Shaanxi 710086, China

Received: 2025-03-18 Revised: 2025-06-02 Accepted: 2025-06-06 Online: 2025-07-21 Published: 2026-02-10
Contact: Mingshu ZHANG
About author: LUO Hu, born in 1993 in Xi'an, Shaanxi, M.S. candidate. His research interests include multi-modal rumor detection.
ZHANG Mingshu, born in 1978 in Kaifeng, Henan, Ph.D., professor. His research interests include cybersecurity, data mining, and social computing. Email: zms2099@163.com
Supported by:
National Social Science Foundation of China (20BXW101)

Abstract:

Multi-modal rumor detection on social media faces two challenges: weak correlation between cross-modal features and insufficient intrinsic representation of the data. To address them, a rumor detection method based on a cross-modal attention mechanism and contrastive learning is proposed. The method extracts fine-grained textual and visual features with a multi-modal feature module, strengthens inter-modal correlation through a cross-modal co-attention mechanism and discriminative learning, captures complex semantic context with multi-head self-attention, and introduces a contrastive learning module to optimize features under machine supervision. Experimental results on the public Twitter-16 and Weibo datasets show that the accuracy of the proposed method is 5.47 and 4.44 percentage points higher, respectively, than that of MMFN (Multi-Modal Fusion Network), the best existing model, which verifies the key roles of fine-grained feature mining and cross-modal similarity modeling in improving detection performance. Analyzing differences in multi-modal content in depth and strengthening cross-modal association mechanisms can therefore effectively improve the accuracy of rumor identification on social media.

Key words: cross-modal, self-attention mechanism, contrastive learning, multi-modal, rumor detection method
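
As an illustration of two ingredients named in the abstract, the following Python sketch shows a generic bidirectional scaled dot-product co-attention between text and image feature vectors, followed by an InfoNCE-style contrastive loss. It is not the authors' implementation: the function names, the absence of learned projection matrices, the use of cosine similarity, and the temperature of 0.1 are all assumptions made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, context):
    """Each query vector attends over the vectors of the other modality.

    Assumption: plain scaled dot-product attention without learned weights.
    """
    scale = math.sqrt(len(queries[0]))
    attended = []
    for q in queries:
        weights = softmax([dot(q, c) / scale for c in context])
        # Weighted sum of the context vectors, one output vector per query.
        attended.append([
            sum(w * c[d] for w, c in zip(weights, context))
            for d in range(len(context[0]))
        ])
    return attended


def co_attention(text_feats, image_feats):
    """Run attention in both directions: text over image and image over text."""
    return (cross_attention(text_feats, image_feats),
            cross_attention(image_feats, text_feats))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def info_nce(text_embs, image_embs, temperature=0.1):
    """InfoNCE-style loss (an assumed choice of contrastive objective).

    The i-th text and i-th image form the positive pair; every other image
    in the batch serves as a negative.
    """
    losses = []
    for i, t in enumerate(text_embs):
        probs = softmax([cosine(t, v) / temperature for v in image_embs])
        losses.append(-math.log(probs[i]))
    return sum(losses) / len(losses)


# Toy features (hypothetical values): two posts, 2-dimensional embeddings.
text = [[1.0, 0.0], [0.0, 1.0]]
image = [[0.9, 0.1], [0.1, 0.9]]

text_to_image, image_to_text = co_attention(text, image)
aligned_loss = info_nce(text, image)          # matching text-image pairs
shuffled_loss = info_nce(text, image[::-1])   # mismatched pairs
print(aligned_loss < shuffled_loss)
```

In this toy setting the loss is lower when each text is paired with its own image than when the pairs are shuffled, which is the behavior a contrastive objective is meant to reward.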

CLC number: TP391.1

Hu LUO, Mingshu ZHANG. Rumor detection method based on cross-modal attention mechanism and contrastive learning[J]. Journal of Computer Applications, 2026, 46(2): 361-367.

Figures/Tables (7)

Dataset	Rumors	Non-rumors	Total
Weibo	3,749	3,783	7,532
Twitter-16	5,007	6,840	11,847
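
The class balance implied by these counts can be checked with a few lines of Python. This is a small worked calculation on the numbers in the table above; the dictionary layout and the helper name `rumor_share` are illustrative assumptions.

```python
# Counts copied from the dataset table above.
datasets = {
    "Weibo": {"rumors": 3749, "non_rumors": 3783, "total": 7532},
    "Twitter-16": {"rumors": 5007, "non_rumors": 6840, "total": 11847},
}


def rumor_share(row):
    """Fraction of posts labeled as rumors."""
    return row["rumors"] / (row["rumors"] + row["non_rumors"])


shares = {}
for name, row in datasets.items():
    # The two class counts should add up to the reported total.
    assert row["rumors"] + row["non_rumors"] == row["total"]
    shares[name] = rumor_share(row)
    print(f"{name}: {shares[name]:.2%} rumors")
```

Weibo is close to balanced, while Twitter-16 contains more non-rumors than rumors, which is one reason to read precision, recall, and F1 alongside accuracy.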

Weibo dataset (values in %):
Model	Accuracy	Precision	Recall	F1
BERT	67.75	68.73	66.52	67.61
ResNet-50	65.32	65.27	64.63	64.95
att-RNN	75.83	75.82	75.27	75.54
EANN	80.93	80.16	79.64	79.93
SAFE	85.02	85.05	85.02	85.03
UPFD	83.26	84.35	84.37	84.36
CAFE	84.67	84.65	84.39	84.52
MMFN	86.66	87.53	85.36	86.43
CONLFE	86.26	88.56	83.01	85.70
CACL	91.10	89.51	92.95	91.20
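
The F1 column is the harmonic mean of precision and recall. The short Python check below recomputes it for three rows of the table immediately above; the helper name `f1_score` and the choice of rows are illustrative assumptions, and the tolerance allows for the two-decimal rounding of the reported values.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) in percent, copied from the table above.
rows = {
    "MMFN": (87.53, 85.36, 86.43),
    "CONLFE": (88.56, 83.01, 85.70),
    "CACL": (89.51, 92.95, 91.20),
}

for model, (p, r, reported) in rows.items():
    recomputed = f1_score(p, r)
    # Reported values are rounded to two decimals, so allow a small gap.
    consistent = abs(recomputed - reported) < 0.01
    print(f"{model}: recomputed {recomputed:.2f}, reported {reported:.2f}, consistent={consistent}")
```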

Twitter-16 dataset (values in %):
Model	Accuracy	Precision	Recall	F1
BERT	53.23	59.92	54.25	56.94
ResNet-50	59.25	67.32	51.37	58.27
att-RNN	68.35	72.32	68.27	70.24
EANN	77.15	76.69	72.36	74.46
SAFE	82.37	81.25	81.07	81.16
UPFD	86.34	85.37	85.24	85.30
CAFE	84.39	84.84	83.44	84.13
MMFN	85.33	83.68	87.95	85.76
CONLFE	85.29	88.32	81.06	84.54
CACL	90.80	89.58	92.17	90.86
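
The accuracy gains over MMFN quoted in the abstract can be recomputed from the accuracy columns of the two model-comparison tables. In the sketch below, the assignment of each table to a dataset follows the CACL accuracies listed per dataset in the variant table (91.10 for Weibo, 90.80 for Twitter-16); the helper name `gain_in_points` is an illustrative assumption.

```python
# Accuracy values in percent, copied from the MMFN and CACL rows of the tables.
accuracy = {
    "Weibo": {"MMFN": 86.66, "CACL": 91.10},
    "Twitter-16": {"MMFN": 85.33, "CACL": 90.80},
}


def gain_in_points(acc, baseline="MMFN", proposed="CACL"):
    """Absolute difference in percentage points, rounded to two decimals."""
    return round(acc[proposed] - acc[baseline], 2)


gains = {name: gain_in_points(acc) for name, acc in accuracy.items()}
for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} percentage points over MMFN")
```

The results agree with the 5.47 (Twitter-16) and 4.44 (Weibo) percentage points stated in the abstract.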

Results of CACL and its variants (values in %):
Dataset	Model	Accuracy	Precision	Recall	F1
Weibo	CACL_T	74.26	74.35	74.42	74.38
Weibo	CACL_V	75.35	74.62	74.47	74.54
Weibo	CACL_M	85.48	85.32	85.25	85.28
Weibo	CACL_MS	90.23	90.27	90.26	90.26
Weibo	CACL	91.10	89.51	92.95	91.20
Twitter-16	CACL_T	72.27	72.32	72.25	72.28
Twitter-16	CACL_V	78.62	78.37	78.63	78.50
Twitter-16	CACL_M	84.45	84.32	84.58	84.45
Twitter-16	CACL_MS	89.45	89.37	89.48	89.43
Twitter-16	CACL	90.80	89.58	92.17	90.86
