Journal of Computer Applications, 2026, Vol. 46, Issue (2): 361-367. DOI: 10.11772/j.issn.1001-9081.2025030266

• Artificial intelligence

Rumor detection method based on cross-modal attention mechanism and contrastive learning

Hu LUO, Mingshu ZHANG

  1. School of Cryptographic Engineering, Engineering University of PAP, Xi’an Shaanxi 710086, China
  • Received: 2025-03-18 Revised: 2025-06-02 Accepted: 2025-06-06 Online: 2025-07-21 Published: 2026-02-10
  • Contact: Mingshu ZHANG
  • About author: LUO Hu, born in 1993, M.S. candidate. His research interests include multi-modal rumor detection.
    ZHANG Mingshu, born in 1978, Ph.D., professor. His research interests include cybersecurity, data mining, and social computing. Email: zms2099@163.com
  • Supported by:
    National Social Science Foundation of China (20BXW101)

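Rumor detection method based on cross-modal attention mechanism and contrastive learning

LUO Hu, ZHANG Mingshu

  1. School of Cryptographic Engineering, Engineering University of PAP, Xi’an 710086
  • Corresponding author: ZHANG Mingshu
  • About the authors: LUO Hu (born 1993), male, from Xi’an, Shaanxi, M.S. candidate. Main research interest: multi-modal rumor detection.
    ZHANG Mingshu (born 1978), male, from Kaifeng, Henan, professor, Ph.D. Main research interests: cybersecurity, data mining, social computing. Email: zms2099@163.com
  • Funding:
    National Social Science Foundation of China (20BXW101)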

Abstract:

Multi-modal rumor detection on social media faces two challenges: weak correlation between cross-modal features and insufficient intrinsic representation of the data. To address them, a rumor detection method based on a cross-modal attention mechanism and contrastive learning is proposed. The method extracts fine-grained textual and visual features with a multi-modal feature module, uses a cross-modal co-attention mechanism and discriminative learning to strengthen inter-modal correlation, captures complex semantic context with multi-head self-attention, and introduces a contrastive learning module to optimize features under machine supervision. Experimental results on the public Twitter-16 and Weibo datasets show that the accuracy of the proposed method is 5.47 and 4.44 percentage points higher, respectively, than that of MMFN (Multi-Modal Fusion Network), the best existing model. These results confirm the key roles of fine-grained feature mining and cross-modal similarity modeling in detection performance. Analyzing differences in multi-modal content in depth and strengthening the cross-modal association mechanism can therefore effectively improve the accuracy of rumor recognition on social media.
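The two central ideas named in the abstract, cross-modal co-attention and contrastive learning, can be illustrated with a minimal sketch. The code below is not the authors' implementation. It assumes plain scaled dot-product attention without learned projections, and it uses an InfoNCE-style loss as a stand-in, since the abstract does not specify the contrastive objective. All function names (`cross_attention`, `info_nce`) and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def cross_attention(queries, context):
    """Let each query vector (one modality) attend over the context
    vectors (the other modality) with scaled dot-product attention.
    Assumption: no learned query/key/value projections."""
    scale = math.sqrt(len(queries[0]))
    attended = []
    for q in queries:
        weights = softmax([dot(q, c) / scale for c in context])
        attended.append([
            sum(w * c[j] for w, c in zip(weights, context))
            for j in range(len(q))
        ])
    return attended


def info_nce(text_embs, image_embs, temperature=0.1):
    """InfoNCE-style contrastive loss (assumed objective): text i should
    be most similar to image i; other images in the batch are negatives."""
    losses = []
    for i, t in enumerate(text_embs):
        probs = softmax([cosine(t, v) / temperature for v in image_embs])
        losses.append(-math.log(probs[i]))
    return sum(losses) / len(losses)


# Toy features: 2 text tokens and 3 image regions, each of dimension 2.
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
image_regions = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

# Co-attention runs in both directions: text over image and image over text.
text_attended = cross_attention(text_tokens, image_regions)
image_attended = cross_attention(image_regions, text_tokens)

# Toy post-level embeddings: matched pairs versus deliberately mismatched pairs.
texts = [[1.0, 0.0], [0.0, 1.0]]
images = [[0.9, 0.1], [0.1, 0.9]]
aligned_loss = info_nce(texts, images)
shuffled_loss = info_nce(texts, images[::-1])

print("text attended to image:", text_attended)
print("aligned loss:", aligned_loss, "shuffled loss:", shuffled_loss)
```

In this sketch the loss is low when each text embedding is closest to its own image and high when the pairs are mismatched, which is the signal a contrastive module would use to pull matched text and image representations together.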

Key words: cross-modal, self-attention mechanism, contrastive learning, multi-modal, rumor detection method

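Abstract:

Multi-modal rumor detection on social media faces the challenges of weak correlation between cross-modal features and insufficient intrinsic representation of the data. A rumor detection method based on a cross-modal attention mechanism and contrastive learning is therefore proposed. The method extracts fine-grained textual and visual features through a multi-modal feature module, strengthens the correlation between modalities with a cross-modal co-attention mechanism and discriminative learning, captures the context of complex semantics with multi-head self-attention, and introduces a contrastive learning module to optimize features under machine supervision. Experimental results on the public Twitter-16 and Weibo datasets show that the accuracy of the proposed method is 5.47 and 4.44 percentage points higher, respectively, than that of MMFN (Multi-Modal Fusion Network), the best existing model, which confirms the key role of fine-grained feature mining and cross-modal similarity modeling in improving detection performance. In-depth analysis of differences in multi-modal content and a stronger cross-modal association mechanism can thus effectively improve the accuracy of rumor identification on social media.

Key words: cross-modal, self-attention mechanism, contrastive learning, multi-modal, rumor detection method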
