Journal of Computer Applications
LIU Shang1, TANG Zhaosen2, LIU Hongyue2, DONG Linfang1, ZHOU Jin2
Abstract: Existing studies on multimodal sentiment analysis in modality-missing scenarios often neglect inter-modal correlations when generating the missing modalities, leading to semantic inconsistencies between the recovered data and the original data; moreover, missing-modality generation based on diffusion models incurs significant computational overhead. To address these issues, a multimodal sentiment analysis model for missing modalities under shared semantic conditions (MM-SSC) was proposed. Firstly, a shared latent space mapping module was designed, in which the shared latent space of a Vector Quantized Variational AutoEncoder (VQ-VAE) was used to capture multimodal distributions effectively and ensure shared semantics across modalities. Secondly, a cross-modal consistency constraint method was proposed to learn mutual information within each modality's latent space, thereby refining semantic information between modalities and enhancing cross-modal consistency. Thirdly, a missing-modality reconstruction and alignment module was designed to reconstruct and refine the missing modalities while reducing the computational overhead of reconstruction. Finally, a multimodal fusion and prediction module was introduced to fuse the reconstructed modalities with the available ones for consistent sentiment analysis. Experimental results show that under fixed modality missing, compared with the IMDer model, the proposed model improves the F1 score and 7-class classification accuracy (ACC7) by average margins of 0.4 and 0.7 percentage points on the CMU-MOSI dataset, and by 0.9 and 0.3 percentage points on the CMU-MOSEI dataset; under random modality missing, compared with the IMDer model, the proposed model improves both the F1 score and ACC7 by 2.2 percentage points on average on the CMU-MOSI dataset, and by 0.8 and 0.4 percentage points on the CMU-MOSEI dataset. These results demonstrate that MM-SSC can effectively handle multimodal sentiment analysis tasks in modality-missing scenarios.
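To make the shared-latent-space idea concrete, below is a minimal PyTorch sketch, not the authors' released implementation: per-modality encoders are quantized against a single shared VQ-VAE codebook, a simple latent-matching term stands in for the paper's mutual-information-based cross-modal constraint, and a missing modality is reconstructed by decoding another modality's quantized codes before fusion and prediction. All module names, feature sizes (e.g. 300-dimensional text, 74-dimensional audio), and loss weightings are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedCodebook(nn.Module):
    """VQ-VAE codebook shared by all modalities (straight-through estimator)."""

    def __init__(self, num_codes: int = 512, dim: int = 64, beta: float = 0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta

    def forward(self, z):                                  # z: (batch, dim)
        dist = torch.cdist(z, self.codebook.weight)        # (batch, num_codes)
        idx = dist.argmin(dim=-1)                          # nearest code per vector
        z_q = self.codebook(idx)
        # Codebook + commitment losses from the original VQ-VAE formulation.
        vq_loss = F.mse_loss(z_q, z.detach()) + self.beta * F.mse_loss(z, z_q.detach())
        # Straight-through estimator so gradients reach the encoder.
        z_q = z + (z_q - z).detach()
        return z_q, idx, vq_loss


class ModalityAutoEncoder(nn.Module):
    """Encoder/decoder pair for one modality around the shared codebook."""

    def __init__(self, in_dim: int, dim: int = 64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.dec = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, in_dim))


def demo():
    torch.manual_seed(0)
    codebook = SharedCodebook()
    text = ModalityAutoEncoder(in_dim=300)    # e.g. GloVe-sized text features
    audio = ModalityAutoEncoder(in_dim=74)    # e.g. COVAREP-sized audio features
    head = nn.Sequential(nn.Linear(2 * 64, 32), nn.ReLU(), nn.Linear(32, 1))

    x_text, x_audio = torch.randn(8, 300), torch.randn(8, 74)
    y = torch.randn(8)                        # sentiment intensity labels

    # Both modalities are quantized against the SAME codebook, so their
    # discrete codes live in one shared semantic space.
    z_text, _, vq_t = codebook(text.enc(x_text))
    z_audio, _, vq_a = codebook(audio.enc(x_audio))

    # Simple consistency term pulling paired latents together; a stand-in
    # for the mutual-information-based cross-modal constraint.
    consistency = F.mse_loss(z_text, z_audio)

    # Missing-modality reconstruction: if audio were absent at test time,
    # the text-side shared codes could be decoded with the audio decoder.
    x_audio_hat = audio.dec(z_text)
    recon = F.mse_loss(x_audio_hat, x_audio)

    # Fusion + prediction over the (available or reconstructed) latents.
    score = head(torch.cat([z_text, z_audio], dim=-1)).squeeze(-1)
    task = F.mse_loss(score, y)

    loss = vq_t + vq_a + consistency + recon + task
    loss.backward()
    print(f"total loss: {loss.item():.4f}")


if __name__ == "__main__":
    demo()
```

Because one codebook serves every modality, the discrete codes themselves carry the shared semantics; reconstructing a missing modality then reduces to a single decoder pass, which is far cheaper than iterative diffusion sampling.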
Key words: multimodal sentiment analysis, modality absence, deep learning, vector quantized variational autoencoder (VQ-VAE), diffusion models
CLC Number: TP391.4
LIU Shang, TANG Zhaosen, LIU Hongyue, DONG Linfang, ZHOU Jin. Multimodal sentiment analysis model for missing modalities under shared semantic conditions [J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2025080959.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2025080959