Multimodal sarcasm detection model integrating contrastive learning with sentiment analysis
Wenbin HU, Tianxiang CAI, Tianle HAN, Zhaoman ZHONG, Changxia MA
Journal of Computer Applications    2025, 45 (5): 1432-1438.   DOI: 10.11772/j.issn.1001-9081.2024050731
Abstract

Users on social media platforms sometimes express their attitudes towards events through sarcastic comments, so sarcasm detection enables more accurate analysis of user sentiments and opinions. However, traditional models based on vocabulary and syntactic structure ignore the role of textual sentiment information in sarcasm detection and suffer performance degradation from data noise. To address these limitations, a Multimodal Sarcasm Detection model integrating Contrastive learning with Sentiment analysis (MSDCS) was proposed. Firstly, BERT (Bidirectional Encoder Representations from Transformers) was used to extract text features, and ViT (Vision Transformer) was used to extract image features. Then, the contrastive loss from contrastive learning was employed to train a shallow model that aligned the image and text features before fusion. Finally, the cross-modal features were combined with the sentiment features for classification, maximizing the use of information across modalities to achieve sarcasm detection. Experimental results on a public multimodal sarcasm detection dataset show that the accuracy and F1 score of MSDCS are at least 1.85% and 1.99% higher than those of the baseline Decomposition and Relation Network (D&R Net), verifying the effectiveness of using sentiment information and contrastive learning in multimodal sarcasm detection.
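The cross-modal alignment step the abstract describes can be sketched with a symmetric InfoNCE-style contrastive loss over matched text–image pairs; this is a minimal illustrative sketch, not the paper's actual implementation — the function names, NumPy formulation, and temperature value are all assumptions.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Project embeddings onto the unit sphere so dot products are cosines."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def contrastive_alignment_loss(text_emb, image_emb, temperature=0.07):
    """Symmetric cross-entropy over a (B, B) cosine-similarity matrix.

    Matched (text_i, image_i) pairs are positives; every other pairing
    in the batch serves as a negative. Minimizing this pulls aligned
    text/image features together before fusion, as in the MSDCS setup.
    """
    t = l2_normalize(text_emb)
    v = l2_normalize(image_emb)
    logits = t @ v.T / temperature          # (B, B) similarity logits
    idx = np.arange(len(logits))

    def cross_entropy(lg):
        # numerically stable log-softmax along rows
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()       # positives on the diagonal

    # average the text->image and image->text directions
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Perfectly aligned embeddings yield a loss near zero, while unrelated embeddings yield a loss near log B, so the gap gives a quick sanity check that the alignment objective is wired correctly.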
