EE-GAN: facial expression recognition method based on generative adversarial network and network integration

doi:10.11772/j.issn.1001-9081.2021040807

Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (3): 750-756.DOI: 10.11772/j.issn.1001-9081.2021040807

Special Issue: Artificial Intelligence; 2021 CCF Conference on Artificial Intelligence (CCFAI 2021)

Dingkang YANG^1,2,3, Shuai HUANG^1,2,3, Shunli WANG^1,2,3, Peng ZHAI^1,2,3, Yidan LI^1,2,3, Lihua ZHANG^1,2,3,4,5

^1. Academy for Engineering & Technology, Fudan University, Shanghai 200433, China
^2. Shanghai Engineering Research Center of AI & Robotics, Shanghai 200433, China
^3. Engineering Research Center of AI & Robotics, Ministry of Education, Shanghai 200433, China
^4. Ji Hua Laboratory, Foshan Guangdong 528000, China
^5. Artificial Intelligence and Unmanned Systems Engineering Research Center of Jilin Province, Changchun Jilin 130000, China

Received: 2021-05-18 Revised: 2021-07-06 Accepted: 2021-07-09 Online: 2021-11-09 Published: 2022-03-10
Contact: Lihua ZHANG
About author: YANG Dingkang, born in 1996 in Chenggu, Shaanxi, Ph. D. candidate. His research interests include computer vision, multimodal emotion recognition and affective computing.
HUANG Shuai, born in 1998 in Fuyang, Anhui, M. S. candidate. His research interests include behavior recognition and emotion recognition.
WANG Shunli, born in 1998 in Shijiazhuang, Hebei, Ph. D. candidate. His research interests include human action analysis and action quality assessment.
ZHAI Peng, born in 1992 in Yangquan, Shanxi, Ph. D. candidate. His research interests include artificial intelligence and reinforcement learning.
LI Yidan, born in 1998 in Yuanping, Shanxi, M. S. candidate. Her research interests include image processing and computational imaging.
Supported by: National Natural Science Foundation of China (82090052); Shanghai Municipal Science and Technology Major Project (2021SHZDZX0103)

Abstract

Because real-life scenes differ widely, the emotions people show vary from scene to scene, which leads to an uneven distribution of labels in emotion datasets. Furthermore, most traditional methods rely on model pre-training and feature engineering to strengthen the representation of expression-related features, but do not consider the complementarity between different feature representations, which limits the generalization and robustness of the model. To address these issues, EE-GAN, an end-to-end deep learning framework that includes the network integration model Ens-Net, was proposed. On the one hand, features of different depths and regions obtained from multiple heterogeneous networks were taken into consideration, features of different semantics and levels were fused, and network integration was used to improve the learning ability of the model. On the other hand, facial images with specific expression labels were generated by a generative adversarial network, which augmented the data while balancing the distribution of expression labels. Qualitative and quantitative evaluations on the CK+, FER2013 and JAFFE datasets demonstrate the effectiveness of the proposed method. Compared with view learning methods, including Locality Preserving Projections (LPP), EE-GAN achieves the highest facial expression recognition accuracy, reaching 82.1% on FER2013, 84.8% on CK+ and 91.5% on JAFFE. Compared with traditional Convolutional Neural Network (CNN) models such as AlexNet, VGG and ResNet, EE-GAN improves the accuracy by at least 9 percentage points.

Key words: facial expression recognition, Generative Adversarial Network (GAN), network integration, uneven label distribution, feature fusion
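
The abstract names two ideas, feature fusion and network integration, without stating which operators are used. The sketch below is a generic illustration only and not the authors' Ens-Net: it assumes fusion by concatenating per-network feature vectors and integration by averaging per-network class probabilities. The function names and all numbers are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(feature_vectors):
    """Feature-level fusion (assumed operator): concatenate the vectors."""
    fused = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused


def ensemble_predict(per_network_logits):
    """Decision-level integration (assumed operator): average probabilities."""
    probs = [softmax(logits) for logits in per_network_logits]
    n_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return avg.index(max(avg)), avg


# Hypothetical outputs of three networks for one face image (3 classes).
features = [[0.2, 0.9], [0.4, 0.1, 0.7], [0.5]]
logits = [[2.0, 0.5, 0.1], [1.2, 1.5, 0.3], [1.8, 0.2, 0.4]]

fused = fuse_features(features)
label, avg_probs = ensemble_predict(logits)
print(len(fused), label)
```

In this toy input the second network prefers class 1, but the averaged probabilities still select class 0, which is the kind of complementarity an ensemble is meant to exploit.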

CLC Number: TP391.41

Dingkang YANG, Shuai HUANG, Shunli WANG, Peng ZHAI, Yidan LI, Lihua ZHANG. EE-GAN: facial expression recognition method based on generative adversarial network and network integration[J]. Journal of Computer Applications, 2022, 42(3): 750-756.

Dataset	Angry	Disgust	Fear	Happy	Neutral	Sadness	Surprise	Contempt
GAN+Basic	800	653	750	800	740	731	780	346
FER2013	3995	56	496	895	653	415	607	—
CK+	135	177	75	207	—	84	249	54
JAFFE	—	29	31	30	30	30	30	—
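
The uneven label distribution mentioned in the abstract can be quantified from this table. The sketch below assumes each row lists per-class image counts (the table caption is not part of this page) and computes a max/min imbalance ratio per row. The `top_up_quota` helper and its target value are hypothetical, added only to illustrate how many generated images per class a balancing step would need.

```python
# Per-class counts copied from the table above; None marks a missing entry
# (shown as a dash in the table).
CLASSES = ["Angry", "Disgust", "Fear", "Happy",
           "Neutral", "Sadness", "Surprise", "Contempt"]
COUNTS = {
    "GAN+Basic": [800, 653, 750, 800, 740, 731, 780, 346],
    "FER2013": [3995, 56, 496, 895, 653, 415, 607, None],
    "CK+": [135, 177, 75, 207, None, 84, 249, 54],
    "JAFFE": [None, 29, 31, 30, 30, 30, 30, None],
}


def imbalance_ratio(counts):
    """Largest class count divided by the smallest, ignoring missing classes."""
    present = [c for c in counts if c is not None]
    return max(present) / min(present)


def top_up_quota(counts, target):
    """Images a generator would have to add per class to reach `target`."""
    return {
        name: max(target - c, 0)
        for name, c in zip(CLASSES, counts)
        if c is not None
    }


for dataset, counts in COUNTS.items():
    print(f"{dataset}: max/min = {imbalance_ratio(counts):.2f}")

# Hypothetical target: lift every CK+ class to the size of its largest class.
ck_quota = top_up_quota(COUNTS["CK+"], target=249)
print(ck_quota)
```

With these counts the FER2013 row has a max/min ratio of about 71, while the GAN+Basic row is near 2.3.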

Model	FER2013	CK+	JAFFE
LPP	0.752	0.760	0.798
D-GPLVM	0.779	0.797	0.850
GPLRF	0.793	0.829	0.874
GMLDA	0.817	0.834	0.882
AlexNet	0.536	0.557	0.665
VGG13	0.621	0.594	0.708
VGG16	0.653	0.674	0.726
ResNet18	0.648	0.665	0.730
ResNet34	0.674	0.673	0.744
ResNet18*	0.695	0.691	0.738
ResNet34*	0.736	0.748	0.756
EE-GAN	0.821	0.848	0.915
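
The gaps between EE-GAN and each baseline can be read off this table with simple arithmetic. The sketch below copies the accuracies as printed and reports differences in percentage points. The grouping of the first four rows as view learning methods follows the abstract's mention of LPP and is otherwise an assumption, as are the helper names.

```python
# Accuracies copied from the table above, ordered (FER2013, CK+, JAFFE).
DATASETS = ("FER2013", "CK+", "JAFFE")
ACCURACY = {
    "LPP": (0.752, 0.760, 0.798),
    "D-GPLVM": (0.779, 0.797, 0.850),
    "GPLRF": (0.793, 0.829, 0.874),
    "GMLDA": (0.817, 0.834, 0.882),
    "AlexNet": (0.536, 0.557, 0.665),
    "VGG13": (0.621, 0.594, 0.708),
    "VGG16": (0.653, 0.674, 0.726),
    "ResNet18": (0.648, 0.665, 0.730),
    "ResNet34": (0.674, 0.673, 0.744),
    "ResNet18*": (0.695, 0.691, 0.738),
    "ResNet34*": (0.736, 0.748, 0.756),
    "EE-GAN": (0.821, 0.848, 0.915),
}
# Assumed grouping: the first four rows are the view learning baselines.
VIEW_LEARNING = ("LPP", "D-GPLVM", "GPLRF", "GMLDA")


def margin_points(model, baseline):
    """Per-dataset accuracy gap in percentage points, rounded to 0.1."""
    return tuple(
        round((a - b) * 100, 1)
        for a, b in zip(ACCURACY[model], ACCURACY[baseline])
    )


def smallest_margin(model, baselines):
    """Smallest gap between `model` and any baseline on any dataset."""
    return min(min(margin_points(model, b)) for b in baselines)


cnn_models = [m for m in ACCURACY
              if m not in VIEW_LEARNING and m != "EE-GAN"]
gmlda_gap = margin_points("EE-GAN", "GMLDA")
cnn_gap = smallest_margin("EE-GAN", cnn_models)
print(dict(zip(DATASETS, gmlda_gap)))
print(cnn_gap)
```

On these numbers the smallest gap over the CNN rows is 8.5 points (ResNet34* on FER2013); the meaning of the asterisk is not defined on this page.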

Model	FER2013	CK+	JAFFE
VGG13+VGG16	0.758	0.769	0.794
VGG13+ResNet18	0.762	0.775	0.806
VGG16+ResNet18	0.765	0.780	0.812
Ens-Net	0.774	0.783	0.827
EE-GAN	0.821	0.848	0.915
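
This table reads like an ablation: pairs of backbones, then Ens-Net, then EE-GAN. Assuming that reading (the caption is not part of this page), the sketch below splits the overall improvement into two steps: the gain of Ens-Net over the best pair on each dataset, and the gain of EE-GAN over Ens-Net. Helper names are hypothetical.

```python
# Accuracies copied from the table above, ordered (FER2013, CK+, JAFFE).
DATASETS = ("FER2013", "CK+", "JAFFE")
ABLATION = {
    "VGG13+VGG16": (0.758, 0.769, 0.794),
    "VGG13+ResNet18": (0.762, 0.775, 0.806),
    "VGG16+ResNet18": (0.765, 0.780, 0.812),
    "Ens-Net": (0.774, 0.783, 0.827),
    "EE-GAN": (0.821, 0.848, 0.915),
}
PAIRS = ("VGG13+VGG16", "VGG13+ResNet18", "VGG16+ResNet18")


def gain_points(after, before):
    """Per-dataset accuracy gain in percentage points, rounded to 0.1."""
    return tuple(round((a - b) * 100, 1) for a, b in zip(after, before))


# Best two-network result on each dataset.
best_pair = tuple(
    max(ABLATION[p][i] for p in PAIRS) for i in range(len(DATASETS))
)

ens_over_pairs = gain_points(ABLATION["Ens-Net"], best_pair)
gan_over_ens = gain_points(ABLATION["EE-GAN"], ABLATION["Ens-Net"])

print(dict(zip(DATASETS, ens_over_pairs)))
print(dict(zip(DATASETS, gan_over_ens)))
```

On these numbers the step from Ens-Net to EE-GAN (4.7 to 8.8 points) is larger than the step from the best pair to Ens-Net (0.3 to 1.5 points).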
