Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (2): 369-376. DOI: 10.11772/j.issn.1001-9081.2023020185

• Artificial Intelligence •


Multimodal emotion recognition method based on multiscale convolution and self-attention feature fusion

Tian CHEN 1,2,3, Conghu CAI 1,2,3, Xiaohui YUAN 1,4, Beibei LUO 1,2,3

  1. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, Anhui 230009, China
    2. Intelligent Interconnected Systems Laboratory of Anhui Province, Hefei, Anhui 230009, China
    3. Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei, Anhui 230009, China
    4. Department of Computer Science and Engineering, University of North Texas, Denton, Texas 76207, USA
  • Received: 2023-02-27 Revised: 2023-04-02 Accepted: 2023-04-07 Online: 2024-02-22 Published: 2024-02-10
  • Contact: Tian CHEN
  • About author: CAI Conghu, born in 1998 in Suzhou, Anhui, M. S. candidate. His research interests include affective computing, artificial intelligence, design for testability, low-power testing.
    YUAN Xiaohui, born in 1973 in Hefei, Anhui, Ph. D., professor. His research interests include computer vision, artificial intelligence, data mining, machine learning.
    LUO Beibei, born in 1999 in Hefei, Anhui, M. S. candidate. Her research interests include affective computing, artificial intelligence, design for testability, low-power testing.
  • Supported by:
    National Natural Science Foundation of China(62174048)


Abstract:

Emotion recognition based on physiological signals is affected by noise and other factors, which leads to low accuracy and weak cross-individual generalization. To address these issues, a multimodal emotion recognition method based on ElectroEncephaloGram (EEG), ElectroCardioGram (ECG), and eye movement signals was proposed. First, multi-scale convolution was applied to the physiological signals to obtain higher-dimensional signal features while reducing the parameter count. Second, self-attention was employed in the fusion of the multimodal signal features to increase the weights of key features and reduce inter-modality feature interference. Finally, a Bi-directional Long Short-Term Memory (Bi-LSTM) network was used to extract temporal information from the fused features and perform classification. Experimental results show that the proposed method achieves recognition accuracies of 90.29%, 91.38%, and 83.53% on the valence, arousal, and valence/arousal four-class tasks, respectively, improvements of 3.46 to 7.11 and 0.92 to 3.15 percentage points over the EEG single-modality and EEG+ECG bimodal methods. The proposed method recognizes emotions accurately and is more stable across individuals.
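
The abstract outlines a three-stage architecture: parallel multi-scale 1D convolutions per modality, self-attention over the fused multimodal features, and a Bi-LSTM classifier. Below is a minimal PyTorch sketch of such a pipeline; every concrete choice, including channel counts, kernel sizes, attention heads, hidden sizes, and fusing the modalities by concatenation along the time axis, is an illustrative assumption rather than the authors' implementation.

# Minimal sketch of the pipeline described in the abstract (PyTorch).
# All shapes and sizes are illustrative assumptions; the paper's actual
# configuration is not given here.
import torch
import torch.nn as nn

class MultiScaleConv(nn.Module):
    """Parallel 1D convolutions with different kernel sizes over one modality."""
    def __init__(self, in_ch, out_ch, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(in_ch, out_ch, k, padding=k // 2) for k in kernel_sizes
        )

    def forward(self, x):  # x: (batch, in_ch, time)
        # Concatenate the per-scale feature maps along the channel axis.
        return torch.cat([b(x) for b in self.branches], dim=1)

class FusionNet(nn.Module):
    def __init__(self, eeg_ch=32, ecg_ch=2, eye_ch=4, feat=32, n_classes=4):
        super().__init__()
        d = feat * 3  # three convolution scales per modality
        self.eeg = MultiScaleConv(eeg_ch, feat)
        self.ecg = MultiScaleConv(ecg_ch, feat)
        self.eye = MultiScaleConv(eye_ch, feat)
        # Self-attention re-weights key features across the fused sequence.
        self.attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)
        self.lstm = nn.LSTM(d, 64, batch_first=True, bidirectional=True)
        self.head = nn.Linear(64 * 2, n_classes)

    def forward(self, eeg, ecg, eye):  # each: (batch, channels, time)
        feats = [self.eeg(eeg), self.ecg(ecg), self.eye(eye)]
        x = torch.cat(feats, dim=2).transpose(1, 2)  # (batch, 3*time, d)
        x, _ = self.attn(x, x, x)                    # fuse with self-attention
        x, _ = self.lstm(x)                          # Bi-LSTM temporal modeling
        return self.head(x[:, -1])                   # classify from the last step

model = FusionNet()
out = model(torch.randn(8, 32, 128), torch.randn(8, 2, 128), torch.randn(8, 4, 128))
print(out.shape)  # torch.Size([8, 4])

Concatenating the per-modality sequences along the time axis before attention is only one plausible reading of "self-attention in the fusion of the multimodal signal features"; stacking the three modalities as parallel tokens per time step would be an equally reasonable layout.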

Key words: ElectroEncephaloGram (EEG), self-attention, ElectroCardioGram (ECG), eye movement, multimodal, emotion recognition

CLC Number: