Journal of Computer Applications (《计算机应用》), 2025, Vol. 45, Issue (5): 1432-1438. DOI: 10.11772/j.issn.1001-9081.2024050731

• 2024 China Conference on Granular Computing and Knowledge Discovery •

Multimodal sarcasm detection model integrating contrastive learning with sentiment analysis

Wenbin HU, Tianxiang CAI, Tianle HAN, Zhaoman ZHONG, Changxia MA

  1. School of Computer Engineering, Jiangsu Ocean University, Lianyungang Jiangsu 222005, China
  • Received: 2024-06-03 Revised: 2024-07-02 Accepted: 2024-07-05 Online: 2024-07-25 Published: 2025-05-10
  • Contact: Tianxiang CAI
  • About author: HU Wenbin, born in 1976 in Lianyungang, Jiangsu, Ph. D., associate professor, CCF member. Her research interests include personal privacy protection, social network analysis, and pattern recognition.
    CAI Tianxiang, born in 2000 in Suizhou, Hubei, M. S. candidate. His research interests include public opinion analysis and sarcasm detection.
    HAN Tianle, born in 2000 in Nanjing, Jiangsu, M. S. candidate. His research interests include public opinion analysis and sentiment analysis.
    ZHONG Zhaoman, born in 1977 in Lianyungang, Jiangsu, Ph. D., professor, CCF member. His research interests include artificial intelligence, natural language processing, big data collection and analysis, and social network analysis.
    MA Changxia, born in 1975 in Lianyungang, Jiangsu, Ph. D., associate professor, CCF member. Her research interests include pattern recognition and intelligent systems, and machine learning.
  • Supported by:
    National Natural Science Foundation of China (72174079)

Abstract:

Comments on social media platforms sometimes use sarcasm to express attitudes toward events, so detecting sarcasm allows user sentiment and opinions to be analyzed more accurately. However, traditional models based on vocabulary and syntactic structure ignore the role of textual sentiment information in sarcasm detection, and their performance degrades under data noise. To address these limitations, a Multimodal Sarcasm Detection model integrating Contrastive learning with Sentiment analysis (MSDCS) was proposed. Firstly, BERT (Bidirectional Encoder Representations from Transformers) was used to extract text features, and ViT (Vision Transformer) was used to extract image features. Then, a contrastive loss was used to train a shallow model so that the image and text features were aligned before fusion. Finally, the fused cross-modal features were combined with sentiment features for classification, making full use of the information across modalities to detect sarcasm. Experimental results on an open multimodal sarcasm detection dataset show that, compared with the baseline model Decomposition and Relation Network (D&R Net), MSDCS improves accuracy and F1 value by at least 1.85% and 1.99%, respectively, verifying the effectiveness of using sentiment information and contrastive learning in multimodal sarcasm detection.

Key words: social media, sarcasm detection, sentiment analysis, contrastive learning, momentum distillation
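
The abstract states that a contrastive loss aligns image and text features before fusion, but it does not give the exact formulation. The following is a minimal sketch of one common choice, a symmetric InfoNCE loss over a batch of paired text and image feature vectors. The function names, the temperature value of 0.07, and the use of plain Python lists in place of BERT and ViT outputs are all assumptions made for illustration, not details taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax of a list of scores."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]


def contrastive_loss(text_feats, image_feats, temperature=0.07):
    """Symmetric InfoNCE loss (assumed form, not confirmed by the paper).

    text_feats[i] and image_feats[i] are treated as a matching pair;
    every other combination in the batch is treated as a negative.
    """
    text = [l2_normalize(t) for t in text_feats]
    image = [l2_normalize(v) for v in image_feats]
    n = len(text)

    # sim[i][j]: cosine similarity of text i and image j, scaled by temperature.
    sim = [[sum(a * b for a, b in zip(t, v)) / temperature for v in image]
           for t in text]

    # Text-to-image direction: the correct image for text i is column i.
    t2i = -sum(log_softmax(sim[i])[i] for i in range(n)) / n

    # Image-to-text direction: same computation on the transposed matrix.
    sim_t = [list(col) for col in zip(*sim)]
    i2t = -sum(log_softmax(sim_t[i])[i] for i in range(n)) / n

    return (t2i + i2t) / 2
```

Calling `contrastive_loss` on a batch whose text and image vectors point in similar directions pair by pair gives a loss near zero, while swapping the image order gives a much larger one. Minimizing a loss of this kind is what pulls matching pairs together in the shared feature space before any fusion step.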

CLC number: