Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (2): 369-376. DOI: 10.11772/j.issn.1001-9081.2023020185
Special topic: Artificial Intelligence
Tian CHEN1,2,3, Conghu CAI1,2,3, Xiaohui YUAN1,4, Beibei LUO1,2,3
Received:
2023-02-27
Revised:
2023-04-02
Accepted:
2023-04-07
Online:
2024-02-22
Published:
2024-02-10
Contact:
Tian CHEN
About author:
CAI Conghu, born in 1998, M.S. candidate. His research interests include affective computing, artificial intelligence, design for testability, and low-power testing.
Abstract:
Emotion recognition based on physiological signals is affected by noise and other factors, and suffers from low accuracy and weak cross-subject generalization. To address this, a multimodal emotion recognition method based on electroencephalogram (EEG), electrocardiogram (ECG), and eye movement signals was proposed. First, multiscale convolution was applied to the physiological signals to obtain higher-dimensional signal features while reducing the number of parameters. Second, a self-attention mechanism was used in the fusion of the multimodal signal features to increase the weights of key features and reduce feature interference between modalities. Finally, a Bidirectional Long Short-Term Memory (Bi-LSTM) network was used to extract temporal information from the fused features and perform classification. Experimental results show that the proposed method achieves recognition accuracies of 90.29%, 91.38%, and 83.53% on the valence, arousal, and valence/arousal four-class tasks respectively, improving accuracy by 3.46 to 7.11 and 0.92 to 3.15 percentage points over the EEG unimodal and EEG/ECG bimodal methods. The proposed method recognizes emotions accurately and is more stable across individuals.
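The pipeline described in the abstract (multiscale convolution for feature extraction, self-attention weighting for multimodal fusion, then a Bi-LSTM classifier) can be sketched as follows. This is a minimal NumPy illustration of the structure only: the kernel sizes, feature dimensions, and function names are assumptions, not the authors' implementation, and the Bi-LSTM stage is omitted.

```python
import numpy as np

def multiscale_conv1d(x, kernel_sizes=(3, 5, 7)):
    """Inception-style multiscale 1D convolution: apply several kernel
    widths in parallel and stack the resulting feature maps."""
    branches = []
    for k in kernel_sizes:
        w = np.random.randn(k) / k           # illustrative random filter
        branches.append(np.convolve(x, w, mode="same"))
    return np.stack(branches)                # (n_branches, signal_len)

def self_attention_fuse(features):
    """Scaled dot-product self-attention over per-modality feature
    vectors: re-weights each modality's features before fusion."""
    q = k = v = features                     # (n_modalities, d)
    d = features.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v                       # (n_modalities, d)

rng = np.random.default_rng(0)
# Three toy physiological signals standing in for EEG, ECG, eye movement.
eeg, ecg, eye = (rng.standard_normal(128) for _ in range(3))
# One feature per convolution branch, via global average pooling.
feats = np.stack([multiscale_conv1d(s).mean(axis=1) for s in (eeg, ecg, eye)])
fused = self_attention_fuse(feats)           # would feed a Bi-LSTM in the paper
```

The self-attention step is what distinguishes this fusion from simple concatenation: each modality's contribution is re-weighted by its similarity to the others before the temporal model sees it.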
Tian CHEN, Conghu CAI, Xiaohui YUAN, Beibei LUO. Multimodal emotion recognition method based on multiscale convolution and self-attention feature fusion[J]. Journal of Computer Applications, 2024, 44(2): 369-376.
Band | Frequency range/Hz | Brain activity state |
---|---|---|
δ | [0.5, 4) | Activity related to deep sleep |
θ | [4, 8) | Drowsiness, meditative state |
α | [8, 16) | Awake with eyes closed, relaxed state |
β | [16, 32) | Focused attention, emotional fluctuation |
γ | [32, 45) | Excitement, elation, or intense emotional states |
Tab. 1 Brain activities corresponding to different frequency bands of EEG
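The band ranges in Tab. 1 correspond to a standard band-power feature extraction from EEG. The NumPy sketch below illustrates that common preprocessing step using the table's ranges; it is an illustration, not necessarily the authors' exact method, and the sampling rate matches DEAP's downsampled 128 Hz.

```python
import numpy as np

# Frequency ranges (Hz) from Tab. 1.
EEG_BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 16),
             "beta": (16, 32), "gamma": (32, 45)}

def band_power(signal, fs, band):
    """Mean power of `signal` within a frequency band, computed from
    the FFT periodogram."""
    freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    lo, hi = band
    mask = (freqs >= lo) & (freqs < hi)
    return psd[mask].mean()

fs = 128                                   # DEAP signals are 128 Hz
t = np.arange(fs * 4) / fs                 # 4 s window
alpha_wave = np.sin(2 * np.pi * 10 * t)    # a 10 Hz tone falls in the α band
powers = {name: band_power(alpha_wave, fs, rng)
          for name, rng in EEG_BANDS.items()}
# The α band dominates for this synthetic relaxed-state signal.
```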
Extraction method | Valence ACC | Valence STD | Arousal ACC | Arousal STD |
---|---|---|---|---|
SVM | 52.28 | 12.11 | 53.92 | 11.74 |
CNN | 62.84 | 10.24 | 65.97 | 9.98 |
1D-Inception | 81.26 | 8.77 | 83.97 | 7.91 |
Tab. 2 Accuracy comparison of 1D-Inception with other feature extraction methods (unit: %)
Sequence length | Valence ACC | Valence STD | Arousal ACC | Arousal STD |
---|---|---|---|---|
1 | 76.48 | 8.39 | 73.44 | 7.61 |
3 | 80.00 | 6.27 | 65.97 | 9.90 |
6 | 90.29 | 6.28 | 91.38 | 6.02 |
10 | 89.92 | 6.49 | 90.08 | 6.07 |
15 | 85.12 | 6.11 | 88.30 | 5.98 |
Tab. 3 Comparison of experimental results with different sequence lengths of Bi-LSTM (unit: %)
Multimodal fusion method | Valence ACC | Valence STD | Arousal ACC | Arousal STD | V/A four-class ACC | V/A four-class STD |
---|---|---|---|---|---|---|
Direct fusion | 84.77 | 7.76 | 86.92 | 7.74 | 75.26 | 13.42 |
Decision-level fusion | 85.26 | 7.21 | 88.30 | 6.98 | 79.55 | 10.29 |
Self-attention fusion | 90.29 | 6.28 | 91.38 | 6.02 | 83.53 | 9.77 |
Tab. 4 Accuracy comparison between self-attention-based fusion method and other fusion methods (unit: %)
Modalities used | Valence ACC | Valence STD | Arousal ACC | Arousal STD | V/A four-class ACC | V/A four-class STD |
---|---|---|---|---|---|---|
EEG | 84.17 | 7.40 | 87.92 | 7.23 | 76.42 | 10.47 |
ECG | 77.84 | 10.24 | 69.97 | 9.98 | 45.39 | 11.20 |
Eye movement | 65.20 | 12.01 | 70.13 | 11.89 | 39.28 | 13.09 |
EEG+ECG (bimodal) | 89.37 | 6.97 | 88.23 | 6.73 | 82.26 | 9.97 |
EEG+eye movement (bimodal) | 84.79 | 7.82 | 86.30 | 7.54 | 78.10 | 10.59 |
Trimodal (proposed) | 90.29 | 6.28 | 91.38 | 6.02 | 83.53 | 9.77 |
Tab. 5 Accuracy comparison of multimodal method with unimodal and bimodal methods (unit: %)
Method | Valence | Arousal |
---|---|---|
Ref. [ ] | 76.56 | 80.46 |
Ref. [ ] | 84.00 | 72.00 |
Ref. [ ] | 85.38 | 77.52 |
Ref. [ ] | 86.61 | 85.34 |
Ref. [ ] | 91.82 | 88.24 |
Proposed method | 90.29 | 91.38 |
Tab. 6 Accuracy comparison with existing physiological signal-based emotion recognition methods (unit: %)
1 | JENKE R, PEER A, BUSS M. Feature extraction and selection for emotion recognition from EEG [J]. IEEE Transactions on Affective Computing, 2014, 5(3): 327-339. 10.1109/taffc.2014.2339834 |
2 | KRUMPAL I. Determinants of social desirability bias in sensitive surveys: A literature review [J]. Quality & Quantity, 2013, 47: 2025-2047. 10.1007/s11135-011-9640-9 |
3 | BOULAY B. Towards a motivationally intelligent pedagogy: How should an intelligent tutor respond to the unmotivated or the demotivated? [C]// New Perspectives on Affect and Learning Technologies. New York: Springer, 2011, 3: 41-52. 10.1007/978-1-4419-9625-1_4 |
4 | ZHAO G, GE Y, SHEN B, et al. Emotion analysis for personality inference from EEG signals [J]. IEEE Transactions on Affective Computing, 2018, 9(3): 362-371. 10.1109/taffc.2017.2786207 |
5 | CHAO H, DONG L. Emotion recognition using three-dimensional feature and convolutional neural network from multichannel EEG signals [J]. IEEE Sensors Journal, 2021, 21(2): 2024-2034. 10.1109/jsen.2020.3020828 |
6 | SONG T, ZHENG W, SONG P, et al. EEG emotion recognition using dynamical graph convolutional neural networks [J]. IEEE Transactions on Affective Computing, 2020, 11(3): 532-541. 10.1109/taffc.2018.2817622 |
7 | CHEN T, YIN H, YUAN X, et al. Emotion recognition based on fusion of long short-term memory networks and SVMs [J]. Digital Signal Processing, 2021, 117: 103153. 10.1016/j.dsp.2021.103153 |
8 | KATSIGIANNIS S, RAMZAN N. DREAMER: A database for emotion recognition through EEG and ECG signals from wireless low-cost off-the-shelf devices [J]. IEEE Journal of Biomedical and Health Informatics, 2017, 22(1): 98-107. 10.1109/jbhi.2017.2688239 |
9 | CHEN T, JU S, REN F, et al. EEG emotion recognition model based on the LIBSVM classifier [J]. Measurement, 2020, 164: 108047. 10.1016/j.measurement.2020.108047 |
10 | CHEN T, FAN M Y, REN F J, et al. Emotion recognition using pupil position [J]. Application Research of Computers, 2021, 38(6): 1765-1769. 10.19734/j.issn.1001-3695.2020.06.0165 |
11 | ALARCÃO S M, FONSECA M J. Emotions recognition using EEG signals: a survey [J]. IEEE Transactions on Affective Computing, 2017, 10(3): 374-393. |
12 | SINGSON L N B, SANCHEZ M T U R, VILLAVERDE J F. Emotion recognition using short-term analysis of heart rate variability and ResNet architecture [C]// Proceedings of the 2021 13th International Conference on Computer and Automation Engineering. Piscataway: IEEE, 2021: 15-18. 10.1109/iccae51876.2021.9426094 |
13 | CHEN J X, ZHANG P W, MAO Z J, et al. Accurate EEG-based emotion recognition on combined features using deep convolutional neural networks [J]. IEEE Access, 2019, 7: 44317-44328. 10.1109/access.2019.2908285 |
14 | KOELSTRA S, MUHL C, SOLEYMANI M, et al. DEAP: a database for emotion analysis; using physiological signals [J]. IEEE Transactions on Affective Computing, 2011, 3(1): 18-31. 10.1109/t-affc.2011.15 |
15 | SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1-9. 10.1109/cvpr.2015.7298594 |
16 | SANTAMARÍA-VÁZQUEZ E, MARTÍNEZ-CAGIGAL V, VAQUERIZO-VILLAR F, et al. EEG-inception: a novel deep convolutional neural network for assistive ERP-based brain-computer interfaces [J]. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2020, 28(12): 2773-2782. 10.1109/tnsre.2020.3048106 |
17 | LEE D-Y, JEONG J-H, SHIM K-H, et al. Classification of upper limb movements using convolutional neural network with 3D inception block [C]// Proceedings of the 2020 8th International Winter Conference on Brain-Computer Interface. Piscataway: IEEE, 2020: 1-5. 10.1109/bci48061.2020.9061671 |
18 | KWON Y-H, SHIN S-B, KIM S-D. Electroencephalography based fusion Two-Dimensional (2D)-Convolution Neural Networks (CNN) model for emotion recognition system [J]. Sensors, 2018, 18(5): 1383. 10.3390/s18051383 |
19 | TAO W, LI C, SONG R, et al. EEG-based emotion recognition via channel-wise attention and self attention [J]. IEEE Transactions on Affective Computing, 2023, 14(1): 382-393. 10.1109/taffc.2020.3025777 |
20 | DZEDZICKIS A, KAKLAUSKAS A, BUCINSKAS V. Human emotion recognition: Review of sensors and methods [J]. Sensors, 2020, 20(3): 592. 10.3390/s20030592 |
21 | IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift [C]// Proceedings of the 32nd International Conference on Machine Learning. New York: JMLR.org, 2015: 448-456. |
22 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [EB/OL]. (2017-12-06) [2023-02-01]. |
23 | DALY I, WILLIAMS D, MALIK A, et al. Personalised, multi-modal, affective state detection for hybrid brain-computer music interfacing [J]. IEEE Transactions on Affective Computing, 2018, 11(1): 111-124. |
24 | DONOHO D L, JOHNSTONE I M. Ideal spatial adaptation by wavelet shrinkage [J]. Biometrika, 1994, 81(3): 425-455. 10.1093/biomet/81.3.425 |
25 | DIKKER S, WAN L, DAVIDESCO I, et al. Brain-to-brain synchrony tracks real-world dynamic group interactions in the classroom [J]. Current Biology, 2017, 27(9): 1375-1380. 10.1016/j.cub.2017.04.002 |
26 | LI W, CHU M, QIAO J. Design of a hierarchy modular neural network and its application in multimodal emotion recognition [J]. Soft Computing, 2019, 23: 11817-11828. 10.1007/s00500-018-03735-0 |
27 | WU X, ZHENG W-L, LI Z, et al. Investigating EEG-based functional connectivity patterns for multimodal emotion recognition [J]. Journal of Neural Engineering, 2022, 19: 016012. 10.1088/1741-2552/ac49a7 |
28 | LI L B, CHEN T, REN F J, et al. Bimodal emotion recognition method based on graph neural network and attention [J]. Journal of Computer Applications, 2023, 43(3): 700-705. 10.11772/j.issn.1001-9081.2022020216 |
29 | PEREIRA E T, GOMES H M, VELOSO L R, et al. Empirical evidence relating EEG signal duration to emotion classification performance [J]. IEEE Transactions on Affective Computing, 2021, 12(1): 154-164. 10.1109/taffc.2018.2854168 |