Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (5): 1432-1438.DOI: 10.11772/j.issn.1001-9081.2024050731
• 2024 China Granular Computing and Knowledge Discovery Conference (CGCKD2024) •
Wenbin HU, Tianxiang CAI, Tianle HAN, Zhaoman ZHONG, Changxia MA
Received: 2024-06-03
Revised: 2024-07-02
Accepted: 2024-07-05
Online: 2024-07-25
Published: 2025-05-10
Contact: Tianxiang CAI
About author: HU Wenbin, born in 1976 in Lianyungang, Jiangsu, Ph. D., associate professor, CCF member. Her research interests include personal privacy protection, social network analysis, and pattern recognition.
Wenbin HU, Tianxiang CAI, Tianle HAN, Zhaoman ZHONG, Changxia MA. Multimodal sarcasm detection model integrating contrastive learning with sentiment analysis[J]. Journal of Computer Applications, 2025, 45(5): 1432-1438.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024050731
| Sample set | Sarcastic | Non-sarcastic | Total |
|---|---|---|---|
| All | 10 647 | 13 988 | 24 635 |
| Training | 8 642 | 11 174 | 19 816 |
| Validation | 1 000 | 1 410 | 2 410 |
| Test | 1 005 | 1 404 | 2 409 |

Tab. 1 Twitter dataset used in experiments
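The split in Table 1 can be sanity-checked arithmetically: each split's cumulative count is the sum of its sarcastic and non-sarcastic samples, and the three splits sum back to the dataset totals. A minimal sketch (not from the paper, just the table's own numbers):

```python
# Per-split counts from Table 1: (sarcastic, non-sarcastic).
splits = {
    "train":      (8_642, 11_174),
    "validation": (1_000, 1_410),
    "test":       (1_005, 1_404),
}

# Each split's cumulative column is sarcastic + non-sarcastic.
cumulative = {name: s + n for name, (s, n) in splits.items()}
assert cumulative == {"train": 19_816, "validation": 2_410, "test": 2_409}

# Columns sum back to the dataset totals (10 647 / 13 988 / 24 635).
assert sum(s for s, _ in splits.values()) == 10_647
assert sum(n for _, n in splits.values()) == 13_988
assert sum(cumulative.values()) == 24_635
print(cumulative)
```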
| Modality | Model | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|
| Image | ResNet | 0.702 8 | 0.716 6 | 0.644 0 | 0.678 3 |
| | ViT | 0.732 6 | 0.676 4 | 0.688 5 | 0.682 4 |
| Text | BiLSTM | 0.735 5 | 0.692 4 | 0.658 7 | 0.675 1 |
| | BERT | 0.793 6 | 0.749 5 | 0.759 2 | 0.754 3 |
| Multimodal (image+text) | TextCNN-ResNet | 0.775 4 | 0.720 8 | 0.709 2 | 0.715 0 |
| | BERT-LSTM-ResNet | 0.756 0 | 0.702 4 | 0.715 0 | 0.708 6 |
| | VLMo-base | 0.837 3 | 0.802 0 | 0.810 0 | 0.805 9 |
| | Res-BERT | 0.824 2 | 0.768 7 | 0.807 7 | 0.787 7 |
| | ALBEF | 0.829 8 | 0.799 7 | 0.787 0 | 0.793 3 |
| | InCrossMGs | 0.832 1 | 0.732 4 | 0.821 7 | 0.773 5 |
| | HFM | 0.834 4 | 0.765 7 | 0.798 6 | 0.801 8 |
| | D&R Net | 0.840 2 | 0.779 7 | 0.834 2 | 0.806 0 |
| | MSDCS | 0.855 8 | 0.810 8 | 0.833 8 | 0.822 1 |

Tab. 2 Comparison of evaluation metrics for experimental models
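The four metrics in Table 2 follow the standard binary-classification definitions, computed for the positive (sarcastic) class. An illustrative sketch of those definitions (not the paper's code; the toy labels below are invented):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Toy labels: 1 = sarcastic, 0 = non-sarcastic.
acc, p, r, f1 = binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(f"acc={acc:.3f} precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Precision and recall diverge when the model over- or under-predicts the sarcastic class, which is why Table 2 reports both alongside the F1 score that balances them.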
| Experiment | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| MSDCS | 0.855 8 | 0.810 8 | 0.833 8 | 0.822 1 |
| w/o-S | 0.837 3 | 0.802 0 | 0.810 0 | 0.805 9 |
| w/o-E | 0.826 8 | 0.808 4 | 0.798 0 | 0.803 2 |
| w/o-S-E | 0.817 3 | 0.793 8 | 0.785 1 | 0.789 4 |
| w/o-M | 0.801 9 | 0.777 6 | 0.764 3 | 0.770 9 |

Tab. 3 Ablation experimental results
| Dataset | Model | Accuracy | Macro-F1 |
|---|---|---|---|
| Twitter-15 | MSDCS | 0.789 8 | 0.770 8 |
| | TomBERT | 0.761 8 | 0.712 7 |
| Twitter-17 | MSDCS | 0.719 8 | 0.687 3 |
| | TomBERT | 0.705 0 | 0.680 4 |

Tab. 4 Sentiment classification results
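Unlike the binary F1 in Table 2, the Macro-F1 in Table 4 averages per-class F1 scores with equal weight, so minority sentiment classes count as much as majority ones. A hedged sketch of that standard definition (the three-way labels below are invented for illustration):

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores (one-vs-rest per class)."""
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Toy three-way sentiment labels: -1 negative, 0 neutral, 1 positive.
score = macro_f1([1, 0, -1, 1, 0], [1, 0, -1, 0, 0], classes=[-1, 0, 1])
print(round(score, 4))
```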