Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (6): 1631-1639. DOI: 10.11772/j.issn.1001-9081.2020091416

Special topic: Artificial Intelligence

• Artificial Intelligence •

Sequential multimodal sentiment analysis model based on multi-task learning

ZHANG Sun, YIN Chunyong

  1. School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, Jiangsu 210044, China
  • Received: 2020-09-14  Revised: 2020-10-24  Online: 2021-06-10  Published: 2020-12-21
  • Corresponding author: YIN Chunyong
  • About the authors: ZHANG Sun, born in 1994 in Lu'an, Anhui, is a Ph. D. candidate. His research interests include deep learning, sentiment analysis, and text classification. YIN Chunyong, born in 1977 in Weifang, Shandong, holds a Ph. D. and is a professor and doctoral supervisor. His research interests include cyberspace security, big data mining, privacy protection, artificial intelligence, and novel computing.
  • Supported by:
    This work is partially supported by the General Program of the National Natural Science Foundation of China (61772282).

Abstract: To address the problems of unimodal feature representation and cross-modal feature fusion in sequential multimodal sentiment analysis, a multi-task learning based sentiment analysis model was proposed by combining the multi-head attention mechanism. Firstly, a Convolutional Neural Network (CNN), a Bidirectional Gated Recurrent Unit (BiGRU) network, and Multi-Head Self-Attention (MHSA) were used to obtain sequential unimodal feature representations. Secondly, bidirectional cross-modal information was fused by multi-head attention. Finally, following the idea of multi-task learning, sentiment polarity classification and sentiment intensity regression were added as auxiliary tasks to improve the overall performance of the main task, sentiment score regression. Experimental results demonstrate that, compared with the Multimodal Factorization Model (MFM), the proposed model improves binary classification accuracy by 7.8 percentage points on the CMU Multimodal Opinion Sentiment and Emotion Intensity (CMU-MOSEI) dataset and by 3.1 percentage points on the CMU Multimodal Opinion-level Sentiment Intensity (CMU-MOSI) dataset. The proposed model is therefore applicable to sentiment analysis in multimodal scenarios, and can provide decision support for applications such as product recommendation, stock market forecasting, and public opinion monitoring.

Key words: sentiment analysis, multimodal, multi-task learning, sequential learning, feature fusion
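The three-stage pipeline in the abstract (CNN → BiGRU → MHSA per modality, bidirectional cross-modal multi-head attention, then one main head and two auxiliary heads) can be sketched as below. This is a minimal illustrative sketch in PyTorch, not the authors' implementation: all layer sizes, pooling choices, and head designs are assumptions, and only two modalities are shown for brevity (the CMU-MOSI/CMU-MOSEI datasets are trimodal, so a third encoder and further attention pairs would be added in the same way).

```python
import torch
import torch.nn as nn

class UnimodalEncoder(nn.Module):
    """Stage 1 sketch: Conv1d -> BiGRU -> multi-head self-attention."""
    def __init__(self, in_dim, hid=32, heads=4):
        super().__init__()
        self.conv = nn.Conv1d(in_dim, hid, kernel_size=3, padding=1)
        self.gru = nn.GRU(hid, hid, batch_first=True, bidirectional=True)
        self.mhsa = nn.MultiheadAttention(2 * hid, heads, batch_first=True)

    def forward(self, x):                       # x: (batch, seq, in_dim)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.gru(h)                      # (batch, seq, 2*hid)
        h, _ = self.mhsa(h, h, h)               # self-attention: q = k = v
        return h

class MultiTaskSentimentModel(nn.Module):
    def __init__(self, text_dim, audio_dim, hid=32, heads=4):
        super().__init__()
        self.text_enc = UnimodalEncoder(text_dim, hid, heads)
        self.audio_enc = UnimodalEncoder(audio_dim, hid, heads)
        d = 2 * hid
        # Stage 2 sketch: bidirectional cross-modal multi-head attention
        self.t2a = nn.MultiheadAttention(d, heads, batch_first=True)
        self.a2t = nn.MultiheadAttention(d, heads, batch_first=True)
        # Stage 3 sketch: main regression head plus two auxiliary heads
        self.score_head = nn.Linear(2 * d, 1)      # main: sentiment score
        self.polarity_head = nn.Linear(2 * d, 2)   # aux: polarity classes
        self.intensity_head = nn.Linear(2 * d, 1)  # aux: intensity

    def forward(self, text, audio):
        t, a = self.text_enc(text), self.audio_enc(audio)
        ta, _ = self.t2a(t, a, a)               # text queries attend to audio
        at, _ = self.a2t(a, t, t)               # audio queries attend to text
        z = torch.cat([ta.mean(dim=1), at.mean(dim=1)], dim=-1)
        return (self.score_head(z), self.polarity_head(z),
                self.intensity_head(z))
```

Training under this sketch would minimize a weighted sum of the three task losses, e.g. `mse(score) + w1 * cross_entropy(polarity) + w2 * mse(intensity)`, where the weights `w1` and `w2` are hyperparameters not specified in the abstract.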

CLC number: