Journal of Computer Applications, 2019, Vol. 39, Issue 9: 2568-2574. DOI: 10.11772/j.issn.1001-9081.2019030540

• Artificial intelligence •

Real-time facial expression recognition based on convolutional neural network with multi-scale kernel feature

LI Minze<sup>1</sup>, LI Xiaoxia<sup>1,2</sup>, WANG Xueyuan<sup>1,2</sup>, SUN Wei<sup>1</sup>   

  1. School of Information Engineering, Southwest University of Science and Technology, Mianyang Sichuan 621010, China;
    2. Key Laboratory of Special Environmental Robotics in Sichuan Province(Southwest University of Science and Technology), Mianyang Sichuan 621010, China
  • Received:2019-04-03 Revised:2019-06-07 Online:2019-06-10 Published:2019-09-10
  • Supported by:

    This work is partially supported by the National Natural Science Foundation of China (61771411), the Sichuan Science and Technology Project (2019YJ0449), and the Graduate Innovation Fund of Southwest University of Science and Technology (18ycx123).


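  • Corresponding author: LI Xiaoxia
  • About the authors: LI Minze (born 1992), male, from Nanchong, Sichuan, M.S. candidate, CCF member; research interests: deep learning, computer vision. LI Xiaoxia (born 1976), female, from Anyue, Sichuan, professor, Ph.D.; research interests: pattern recognition, computer vision. WANG Xueyuan (born 1974), male, from Mianyang, Sichuan, associate professor, Ph.D.; research interest: image processing. SUN Wei (born 1995), male, from Dazhou, Sichuan, M.S. candidate; research interest: image processing.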



To address the insufficient generalization ability, poor stability, and difficulty in meeting real-time requirements of facial expression recognition, a real-time facial expression recognition method based on a convolutional neural network with multi-scale kernel features was proposed. Firstly, an improved lightweight face detection network, MSSD (MobileNet + Single Shot multiBox Detector), was proposed, and the detected face coordinates were tracked with a Kernel Correlation Filter (KCF) model to improve detection speed and stability. Then, three linear bottleneck blocks with convolution kernels of three different scales were used to form three branches, whose outputs were fused by channel concatenation to form a multi-scale kernel convolution unit; the resulting feature diversity improved the accuracy of expression recognition. Finally, to improve the generalization ability of the model and prevent over-fitting, different linear transformations were used for data augmentation, and the model trained on the FER-2013 facial expression dataset was transferred to the small-sample CK+ dataset for retraining. Experimental results show that the recognition rate of the proposed method reaches 73.0% on the FER-2013 dataset, 1.8 percentage points higher than that of the Kaggle Expression Recognition Challenge champion, and 99.5% on the CK+ dataset. For 640×480 video, the face detection speed of the proposed method reaches 158 frames per second, 6.3 times that of the mainstream face detection network MTCNN (Multi-Task Cascaded Convolutional Neural Network), and the overall speed of face detection plus expression recognition reaches 78 frames per second. The proposed method therefore achieves fast and accurate facial expression recognition.
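The multi-scale kernel convolution unit described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' network. The kernel sizes (3, 5, 7), the names `conv2d_same` and `multi_scale_unit`, and the fixed averaging kernel standing in for learned weights are all assumptions made for illustration. The 1×1 expansion and projection layers of a real linear bottleneck are omitted. Only the idea of three parallel branches fused along the channel axis is shown.

```python
from typing import List, Sequence

Channel = List[List[float]]   # one channel: H x W
FeatureMap = List[Channel]    # C x H x W


def conv2d_same(channel: Channel, k: int) -> Channel:
    """Apply a k x k averaging kernel with zero padding (output keeps H x W).

    The averaging kernel is a placeholder for learned weights (assumption).
    """
    h, w = len(channel), len(channel[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:  # zero padding outside
                        acc += channel[y][x]
            out[i][j] = acc / (k * k)
    return out


def multi_scale_unit(x: FeatureMap,
                     kernel_sizes: Sequence[int] = (3, 5, 7)) -> FeatureMap:
    """Run one branch per kernel size, then concatenate along channels."""
    fused: FeatureMap = []
    for k in kernel_sizes:          # one branch per kernel scale
        for channel in x:           # per-channel (depthwise-style) filtering
            fused.append(conv2d_same(channel, k))
    return fused


if __name__ == "__main__":
    # Hypothetical 2-channel, 4 x 4 input feature map.
    x = [[[float(i * 4 + j) for j in range(4)] for i in range(4)]
         for _ in range(2)]
    y = multi_scale_unit(x)
    print(len(y), len(y[0]), len(y[0][0]))  # channels, height, width
```

Because every branch preserves the spatial size, the fused output has three times as many channels as the input, so later layers see features computed at several receptive-field sizes side by side.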

Key words: Facial Expression Recognition (FER), Convolutional Neural Network (CNN), face detection, Kernel Correlation Filter (KCF), transfer learning



