1 |
LI D, LIU J, YANG Z, et al. Speech emotion recognition using recurrent neural networks with directional self-attention[J]. Expert Systems with Applications, 2021, 173: 114683.
|
2 |
张雪英,孙颖,张卫,等. 语音情感识别的关键技术[J]. 太原理工大学学报, 2015, 46(6): 629-636.
|
|
ZHANG X Y, SUN Y, ZHANG W, et al. Key technologies for speech emotion recognition [J]. Journal of Taiyuan University of Technology, 2015, 46(6): 629-636.
|
3 |
SWAIN M, ROUTRAY A, KABISATPATHY P. Databases, features and classifiers for speech emotion recognition: a review[J]. International Journal of Speech Technology, 2018, 21: 93-120.
|
4 |
ZHANG S, ZHAO X, TIAN Q. Spontaneous speech emotion recognition using multiscale deep convolutional LSTM[J]. IEEE Transactions on Affective Computing, 2022, 13(2): 680-688.
|
5 |
MIRSAMADI S, BARSOUM E, ZHANG C. Automatic speech emotion recognition using recurrent neural networks with local attention [C] // Proceedings of the 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2017: 2227-2231.
|
6 |
WANG H, CUI Z, CHEN Y, et al. Predicting hospital readmission via cost-sensitive deep learning[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2018,15(6): 1968-1978.
|
7 |
刘建兴,蔡国永,吕光瑞,等. 基于深度双向长短时记忆网络的文本情感分类[J]. 桂林电子科技大学学报,2018, 38(2): 122-126.
|
|
LIU J X, CAI G Y, LV G R, et al. Text sentiment classification based on deep bidirectional long-short-term memory network[J]. Journal of Guilin University of Electronic Technology, 2018, 38(2): 122-126.
|
8 |
PENG Z, LU Y, PAN S, et al. Efficient speech emotion recognition using multi-scale CNN and attention [C]// Proceedings of the 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2021: 3020-3024.
|
9 |
XIE Y, LIANG R Y, LIANG Z L, et al. Speech emotion classification using attention based LSTM [J]. IEEE/ACM Transactions. on Audio, Speech, and Language Processing, 2019, 27(11): 1675-1685.
|
10 |
VASWANI A, SHAZEER N, PARMAR N,et al .Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010.
|
11 |
戴妍妍,金赟,马勇,等. 基于高效通道注意力机制的语音情感识别方法[J]. 信号处理,2021, 37(10): 1835-1842.
|
|
DAI Y Y, JIN Y, MA Y, et al .Speech emotion recognition based on efficient channel attention[J]. Journal of Signal Processing, 2021, 37(10): 1835-1842.
|
12 |
WANG Q, WU B, ZHU P,et al . ECA- Net: efficient channel attention for deep convolutional neural networks[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11531-11539.
|
13 |
BHOSALE S, CHAKRABORTY R, KOPPARAPU S K. Deep encoded linguistic and acoustic cues for attention based end to end speech emotion recognition[C]// Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2020: 7189-7193.
|
14 |
LI R, WU Z, JIA J, et al. Dilated residual network with multi- head self-attention for speech emotion recognition[C]// Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2019: 6675-6679.
|
15 |
耿磊,傅洪亮,陶华伟,等. 基于动态卷积递归神经网络的语音情感识别[J]. 计算机工程,2023, 49(4): 125-130.
|
|
GENG L, FU H L, TAO H W, et al. Speech emotion recognition based on dynamic convolutional recurrent neural network[J]. Computer Engineering, 2023, 49(4): 125-130.
|
16 |
PENG Z C, ZHU Z, UNOKI M, et al. Auditory-inspired end-to-end speech emotion recognition using 3D convolutional recurrent neural networks based on spectral-temporal representation[C]// Proceedings of the 2018 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2018: 1-6.
|
17 |
CHEN Q, HUANG G. A novel dual attention-based BiLSTM with hybrid features in speech emotion recognition[J]. Engineering Applications of Artificial Intelligence, 2021, 102: 104277.
|
18 |
SEPAS-MOGHADDAM A, ETEMAD A, PEREIRA F, et al. Facial emotion recognition using light field images with deep attention-based bidirectional LSTM[C]// Proceedings of the 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2020: 3367-3371.
|
19 |
GAO S-H, CHENG M-M, ZHAO K, et al. Res2Net: a new multi-scale backbone architecture[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(2): 652-662.
|
20 |
HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13708-13717.
|
21 |
IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift [C]// Proceedings of the 32nd International Conference on Machine Learning. New York: JMLR.org, 2015: 448-456.
|
22 |
ULYANOV D, VEDALDI A, LEMPITSKY V. Instance normalization: the missing ingredient for fast stylization[EB/OL]. (2016-07-27)[2023-08-01]. .
|
23 |
BURKHARDT F, PAESCHKE A, ROLFES M, et al. A database of German emotional speech[C]// Proceedings of the Interspeech 2005. Baixas: International Speech Communication Association, 2005: 1517-1520.
|
24 |
BUSSO C, BULUT M, LEE C-C, et al. IEMOCAP: interactive emotional dyadic motion capture database[J]. Language Resources and Evaluation Lang, 2008, 42(4): 335-359.
|
25 |
HOU M, ZHANG Z, CAO Q, et al. Multi-view speech emotion recognition via collective relation construction[J]. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2022, 30: 218-229.
|
26 |
ABDEL-HAMID O, A-R MOHAMED, JIANG H, et al. Convolutional neural networks for speech recognition[J]. IEEE/ACM Transactions on Audio, Speech, Language Processing, 2014, 22(10): 1533-1545.
|
27 |
ZHAO Z, LI Q, ZHANG Z, et al. Combining a parallel 2D CNN with a self-attention dilated residual network for CTC-based discrete speech emotion recognition[J]. Neural Networks, 2021, 141: 52-60.
|
28 |
LI S, XING X, FAN W, et al. Spatiotemporal and frequential cascaded attention networks for speech emotion recognition[J]. Neurocomputing, 2021, 448: 238-248.
|