Journal of Computer Applications
LIU Shang1, TANG Zhaosen2, LIU Hongyue2, DONG Linfang1, ZHOU Jin2
Abstract: Existing studies on multimodal sentiment analysis in modality-missing scenarios often neglect inter-modal correlations when generating the missing modalities, leading to semantic inconsistencies between the recovered data and the original data; moreover, missing-modality generation based on diffusion models incurs significant computational overhead. To address these issues, a multimodal sentiment analysis model for missing modalities under shared semantic conditions (MM-SSC) was proposed. Firstly, a shared latent space mapping module was designed, in which the shared latent space of a Vector Quantized Variational AutoEncoder (VQ-VAE) was used to capture multimodal distributions effectively and ensure shared semantics across modalities. Secondly, a cross-modal consistency constraint method was proposed to learn mutual information within each modality's latent space, thereby refining semantic information between modalities and enhancing cross-modal consistency. Thirdly, a missing-modality reconstruction and alignment module was designed to reconstruct and refine the missing modalities while reducing the computational overhead of reconstruction. Finally, a multimodal fusion and prediction module was introduced to fuse the reconstructed modalities with the available ones for consistent sentiment analysis. Experimental results show that under fixed modality missing, compared with the IMDer model, the proposed model improves the F1 score and 7-class classification accuracy (ACC7) by average margins of 0.4 and 0.7 percentage points on the CMU-MOSI dataset, and by 0.9 and 0.3 percentage points on the CMU-MOSEI dataset; under random modality missing, compared with the IMDer model, the proposed model improves both the F1 score and ACC7 by 2.2 percentage points on average on the CMU-MOSI dataset, and by 0.8 and 0.4 percentage points on the CMU-MOSEI dataset. These results demonstrate that MM-SSC can effectively handle multimodal sentiment analysis tasks in modality-missing scenarios.
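To make the shared-latent-space idea concrete, below is a minimal PyTorch sketch, not the authors' released implementation: per-modality encoders are quantized against a single shared VQ-VAE codebook, a simple latent-matching term stands in for the paper's mutual-information-based cross-modal constraint, and a missing modality is reconstructed by decoding another modality's quantized codes before fusion and prediction. All module names, feature sizes (e.g. 300-dimensional text, 74-dimensional audio), and loss weightings are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedCodebook(nn.Module):
    """VQ-VAE codebook shared by all modalities (straight-through estimator)."""

    def __init__(self, num_codes: int = 512, dim: int = 64, beta: float = 0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta

    def forward(self, z):                                  # z: (batch, dim)
        dist = torch.cdist(z, self.codebook.weight)        # (batch, num_codes)
        idx = dist.argmin(dim=-1)                          # nearest code per vector
        z_q = self.codebook(idx)
        # Codebook + commitment losses from the original VQ-VAE formulation.
        vq_loss = F.mse_loss(z_q, z.detach()) + self.beta * F.mse_loss(z, z_q.detach())
        # Straight-through estimator so gradients reach the encoder.
        z_q = z + (z_q - z).detach()
        return z_q, idx, vq_loss


class ModalityAutoEncoder(nn.Module):
    """Encoder/decoder pair for one modality around the shared codebook."""

    def __init__(self, in_dim: int, dim: int = 64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.dec = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, in_dim))


def demo():
    torch.manual_seed(0)
    codebook = SharedCodebook()
    text = ModalityAutoEncoder(in_dim=300)    # e.g. GloVe-sized text features
    audio = ModalityAutoEncoder(in_dim=74)    # e.g. COVAREP-sized audio features
    head = nn.Sequential(nn.Linear(2 * 64, 32), nn.ReLU(), nn.Linear(32, 1))

    x_text, x_audio = torch.randn(8, 300), torch.randn(8, 74)
    y = torch.randn(8)                        # sentiment intensity labels

    # Both modalities are quantized against the SAME codebook, so their
    # discrete codes live in one shared semantic space.
    z_text, _, vq_t = codebook(text.enc(x_text))
    z_audio, _, vq_a = codebook(audio.enc(x_audio))

    # Simple consistency term pulling paired latents together; a stand-in
    # for the mutual-information-based cross-modal constraint.
    consistency = F.mse_loss(z_text, z_audio)

    # Missing-modality reconstruction: if audio were absent at test time,
    # the text-side shared codes could be decoded with the audio decoder.
    x_audio_hat = audio.dec(z_text)
    recon = F.mse_loss(x_audio_hat, x_audio)

    # Fusion + prediction over the (available or reconstructed) latents.
    score = head(torch.cat([z_text, z_audio], dim=-1)).squeeze(-1)
    task = F.mse_loss(score, y)

    loss = vq_t + vq_a + consistency + recon + task
    loss.backward()
    print(f"total loss: {loss.item():.4f}")


if __name__ == "__main__":
    demo()
```

Because one codebook serves every modality, the discrete codes themselves carry the shared semantics; reconstructing a missing modality then reduces to a single decoder pass, which is far cheaper than iterative diffusion sampling.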
Key words: multimodal sentiment analysis, modality absence, deep learning, vector quantized variational autoencoder (VQ-VAE), diffusion models
CLC Number: TP391.4
LIU Shang, TANG Zhaosen, LIU Hongyue, DONG Linfang, ZHOU Jin. Multimodal sentiment analysis model for missing modalities under shared semantic conditions [J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2025080959.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2025080959