Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (5): 1268-1274.DOI: 10.11772/j.issn.1001-9081.2020071092

Special Issue: Artificial Intelligence

• Artificial intelligence •

Multimodal sentiment analysis based on feature fusion of attention mechanism-bidirectional gated recurrent unit

LAI Xuemei1,2, TANG Hong1,2, CHEN Hongyu1,2, LI Shanshan1,2   

  1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China;
    2. Chongqing Key Laboratory of Mobile Communications Technology (Chongqing University of Posts and Telecommunications), Chongqing 400065, China
  • Received: 2020-07-23  Revised: 2020-09-09  Online: 2021-05-10  Published: 2021-01-19
  • Supported by:
    This work is partially supported by the Program for Changjiang Scholars and Innovative Research Teams in Universities (IRT_16R72).


  • Corresponding author: LAI Xuemei
  • About the authors: LAI Xuemei, born in 1995 in Guang'an, Sichuan, is an M.S. candidate whose research interests include machine learning and sentiment analysis. TANG Hong, born in 1967 in Nanchong, Sichuan, is a professor and Ph.D. whose research interests include computer networks and mobile communications. CHEN Hongyu, born in 1995 in Chongqing, is an M.S. candidate whose research interests include machine learning and data mining. LI Shanshan, born in 1993 in Nanyang, Henan, is an M.S. candidate whose research interests include machine learning and network security.
Abstract: Multimodal sentiment analysis of video often neglects cross-modality interactions and the differing contributions of the individual modalities to the final classification result. To address this problem, a multimodal sentiment analysis model named Attention Mechanism based feature Fusion-Bidirectional Gated Recurrent Unit (AMF-BiGRU) was proposed. Firstly, a Bidirectional Gated Recurrent Unit (BiGRU) was used to model the interdependence between utterances within each modality and to extract each modality's internal information. Secondly, a cross-modality attention interaction network layer combined this internal information with the interactions between modalities. Thirdly, an attention mechanism was introduced to determine the attention weight of each modality so that the modality features could be effectively fused. Finally, the sentiment classification results were obtained through a fully connected layer and a softmax layer. Experiments were conducted on the public CMU-MOSI (CMU Multimodal Opinion-level Sentiment Intensity) and CMU-MOSEI (CMU Multimodal Opinion Sentiment and Emotion Intensity) datasets. The results show that, compared with traditional multimodal sentiment analysis methods such as the Multi-Attention Recurrent Network (MARN), AMF-BiGRU improves accuracy and F1-score by 6.01% and 6.52% on CMU-MOSI, and by 2.72% and 2.30% on CMU-MOSEI, respectively. The AMF-BiGRU model can therefore effectively improve multimodal sentiment classification performance.
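The attention-based fusion stage described in the abstract (assigning each modality a learned weight and combining the modality features into one representation) can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the function name `attention_fuse`, the single-layer scoring parameters `w` and `b`, and all dimensions are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(modality_feats, w, b=0.0):
    """Fuse per-modality features with attention weights.

    modality_feats: (M, d) array, one feature vector per modality
                    (e.g. M = 3 for text, audio, visual)
    w, b: parameters of a hypothetical scoring layer, shapes (d,) and scalar
    Returns the fused (d,) vector and the (M,) attention weights.
    """
    scores = modality_feats @ w + b      # one unnormalized score per modality
    alpha = softmax(scores)              # attention weight per modality, sums to 1
    fused = alpha @ modality_feats       # attention-weighted sum of modality features
    return fused, alpha

# Illustrative usage with random "text / audio / visual" features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 8))          # 3 modalities, feature dimension 8
w = rng.normal(size=8)
fused, alpha = attention_fuse(feats, w)
```

In the full model, the fused vector would then pass through the fully connected and softmax layers to produce the sentiment classes.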

Key words: multimodal, sentiment analysis, Bidirectional Gated Recurrent Unit (BiGRU), attention mechanism, feature fusion

