Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (3): 750-756. DOI: 10.11772/j.issn.1001-9081.2021040807
• 2021 China Computer Federation Conference on Artificial Intelligence (CCFAI 2021) •
Dingkang YANG1,2,3, Shuai HUANG1,2,3, Shunli WANG1,2,3, Peng ZHAI1,2,3, Yidan LI1,2,3, Lihua ZHANG1,2,3,4,5
Received: 2021-05-18
Revised: 2021-07-06
Accepted: 2021-07-09
Online: 2021-11-09
Published: 2022-03-10
Contact: Lihua ZHANG
About author: YANG Dingkang, born in 1996 in Chenggu, Shaanxi, Ph.D. candidate. His research interests include computer vision, multimodal emotion recognition, and affective computing.
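Abstract:
Because real-world scenes vary widely, the emotions people display differ from scene to scene, so the label distributions of collected emotion datasets are imbalanced. In addition, traditional methods mostly rely on model pre-training and feature engineering to strengthen the representation of expression-related features, but they ignore the complementarity between different feature representations, which limits the generalization and robustness of the model. To address these problems, an end-to-end deep learning framework named EE-GAN, which contains the network ensemble model Ens-Net, is proposed. On the one hand, EE-GAN takes into account features of different depths and regions obtained from multiple heterogeneous networks, fuses features of different semantics and different levels, and improves the learning ability of the model through network ensembling. On the other hand, it uses a generative adversarial network to generate facial images with specified expression labels, which augments the data and at the same time balances the distribution of expression labels. Qualitative and quantitative experiments on the FER2013, CK+, and JAFFE datasets verify the effectiveness of the proposed method: compared with view-learning-based methods, including Locality Preserving Projection (LPP), EE-GAN achieves the highest facial expression recognition accuracy, reaching 82.1%, 84.8%, and 91.5% on the three datasets respectively; compared with traditional Convolutional Neural Network (CNN) models such as AlexNet, VGG, and ResNet, the accuracy is improved by at least about 9 percentage points.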
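As a rough illustration of the network-ensemble idea described in the abstract, the sketch below averages the class probabilities produced by several classifiers (late fusion). This is an assumption chosen for brevity: the abstract describes fusing features from heterogeneous networks but does not give the fusion rule used by Ens-Net, and the label order, the logits, and the function names here are all hypothetical.

```python
import math

# Hypothetical label order; this page does not specify one.
LABELS = ["Angry", "Disgust", "Fear", "Happy", "Neutral", "Sadness", "Surprise"]


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_logits):
    """Average the softmax outputs of several models and pick the top class."""
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    mean = [sum(p[i] for p in probs) / n_models for i in range(len(LABELS))]
    best = max(range(len(LABELS)), key=lambda i: mean[i])
    return LABELS[best], mean


# Made-up logits standing in for three heterogeneous networks.
logits_a = [0.2, 0.1, 0.3, 2.5, 0.4, 0.1, 0.9]
logits_b = [0.1, 0.0, 0.2, 1.8, 0.3, 0.2, 2.0]
logits_c = [0.3, 0.2, 0.1, 2.2, 0.5, 0.1, 0.7]

label, mean_probs = ensemble_predict([logits_a, logits_b, logits_c])
print(label)
```

In this toy input the second model alone would favor Surprise, while the averaged prediction follows the other two models, which is the kind of complementarity an ensemble is meant to exploit.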
Dingkang YANG, Shuai HUANG, Shunli WANG, Peng ZHAI, Yidan LI, Lihua ZHANG. EE-GAN: facial expression recognition method based on generative adversarial network and network integration[J]. Journal of Computer Applications, 2022, 42(3): 750-756.
Dataset | Angry | Disgust | Fear | Happy | Neutral | Sadness | Surprise | Contempt |
---|---|---|---|---|---|---|---|---|
GAN+Basic | 800 | 653 | 750 | 800 | 740 | 731 | 780 | 346 |
FER2013 | 3 995 | 56 | 496 | 895 | 653 | 415 | 607 | — |
CK+ | 135 | 177 | 75 | 207 | — | 84 | 249 | 54 |
JAFFE | — | 29 | 31 | 30 | 30 | 30 | 30 | — |
Tab. 1 Numbers of images per expression in the FER2013, CK+, and JAFFE datasets and in the GAN-integrated dataset
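To make the label imbalance in Table 1 concrete, the sketch below takes the CK+ row and computes the ratio between the largest and smallest class, plus the number of synthetic images each class would need to match the largest one. The "match the largest class" target and the helper names are assumptions made for illustration; the page does not describe how the authors chose the per-class counts in the GAN-integrated row.

```python
# Per-class image counts for CK+, copied from Table 1 (Neutral is not listed).
ck_plus = {
    "Angry": 135, "Disgust": 177, "Fear": 75, "Happy": 207,
    "Sadness": 84, "Surprise": 249, "Contempt": 54,
}


def imbalance_ratio(counts):
    """Ratio of the most frequent class to the least frequent class."""
    return max(counts.values()) / min(counts.values())


def images_to_generate(counts, target=None):
    """Synthetic images needed per class to reach `target`.

    By default the target is the size of the largest class (an assumption).
    """
    if target is None:
        target = max(counts.values())
    return {label: max(0, target - n) for label, n in counts.items()}


ratio = imbalance_ratio(ck_plus)
deficits = images_to_generate(ck_plus)

print(f"imbalance ratio: {ratio:.2f}")
for label, extra in deficits.items():
    print(f"{label:9s} +{extra}")
```

Passing an explicit `target` models a different balancing policy, for example capping every class at a fixed size.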
Model | FER2013 | CK+ | JAFFE |
---|---|---|---|
LPP | 0.752 | 0.760 | 0.798 |
D-GPLVM | 0.779 | 0.797 | 0.850 |
GPLRF | 0.793 | 0.829 | 0.874 |
GMLDA | 0.817 | 0.834 | 0.882 |
AlexNet | 0.536 | 0.557 | 0.665 |
VGG13 | 0.621 | 0.594 | 0.708 |
VGG16 | 0.653 | 0.674 | 0.726 |
ResNet18 | 0.648 | 0.665 | 0.730 |
ResNet34 | 0.674 | 0.673 | 0.744 |
ResNet18* | 0.695 | 0.691 | 0.738 |
ResNet34* | 0.736 | 0.748 | 0.756 |
EE-GAN | 0.821 | 0.848 | 0.915 |
Tab. 2 Accuracies of different network models on the FER2013, CK+, and JAFFE datasets
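The abstract's stated minimum gain of 9 percentage points over the CNN baselines can be checked against the Table 2 figures. The sketch below finds, for each dataset, the strongest CNN baseline and the gap to EE-GAN in percentage points. The accuracies are copied from the table; the function name is hypothetical, and the meaning of the asterisk on the ResNet rows is not explained on this page, so those rows are simply treated as further baselines.

```python
# Accuracies copied from Table 2, in the order (FER2013, CK+, JAFFE).
DATASETS = ("FER2013", "CK+", "JAFFE")
ee_gan = (0.821, 0.848, 0.915)
cnn_baselines = {
    "AlexNet": (0.536, 0.557, 0.665),
    "VGG13": (0.621, 0.594, 0.708),
    "VGG16": (0.653, 0.674, 0.726),
    "ResNet18": (0.648, 0.665, 0.730),
    "ResNet34": (0.674, 0.673, 0.744),
    "ResNet18*": (0.695, 0.691, 0.738),
    "ResNet34*": (0.736, 0.748, 0.756),
}


def smallest_margin(ours, baselines):
    """For each dataset, return the strongest baseline and the gap to it.

    The gap is expressed in percentage points, rounded to one decimal.
    """
    margins = {}
    for i, name in enumerate(DATASETS):
        best_model = max(baselines, key=lambda m: baselines[m][i])
        gap = round((ours[i] - baselines[best_model][i]) * 100, 1)
        margins[name] = (best_model, gap)
    return margins


margins = smallest_margin(ee_gan, cnn_baselines)
for dataset, (model, gap) in margins.items():
    print(f"{dataset}: +{gap} points over {model}")
```

On these numbers the tightest gap is on FER2013 (8.5 points over ResNet34*), which matches the abstract's figure once rounded to a whole number; the gaps on CK+ and JAFFE are larger.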
Model | FER2013 | CK+ | JAFFE |
---|---|---|---|
VGG13+VGG16 | 0.758 | 0.769 | 0.794 |
VGG13+ResNet18 | 0.762 | 0.775 | 0.806 |
VGG16+ResNet18 | 0.765 | 0.780 | 0.812 |
Ens-Net | 0.774 | 0.783 | 0.827 |
EE-GAN | 0.821 | 0.848 | 0.915 |
Tab. 3 Ablation experiment results on the FER2013, CK+, and JAFFE datasets