Facial landmark detection based on ResNeXt with asymmetric convolution and squeeze excitation

doi:10.11772/j.issn.1001-9081.2020111847

Journal of Computer Applications, 2021, Vol. 41, Issue (9): 2741-2747. DOI: 10.11772/j.issn.1001-9081.2020111847

Special Issue: Multimedia computing and computer simulation


WANG Hebing, ZHANG Chunmei

School of Computer Science and Engineering, North Minzu University, Yinchuan, Ningxia 750021, China

Received: 2020-11-25 Revised: 2021-02-26 Online: 2021-09-10 Published: 2021-09-15
Supported by:
This work is partially supported by the General Project of the Ningxia Hui Autonomous Region Key Research and Development Plan (2019BDE0311) and the First-Class Discipline Construction (Electronic Science and Technology Discipline) Funded Project in Ningxia Colleges and Universities (NXYLXK2017A07).

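Corresponding author: ZHANG Chunmei
About the authors: WANG Hebing (born 1992), male, from Xingtai, Hebei, M.S. candidate; main research interests: deep learning, facial landmark detection. ZHANG Chunmei (born 1964), female, from Yinchuan, Ningxia, professor, M.S., CCF member; main research interests: visual signal processing, remote sensing image processing.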

Abstract

The cascaded Deep Convolutional Neural Network (DCNN) algorithm was the first model to apply a Convolutional Neural Network (CNN) to facial landmark detection, and the use of CNN improved accuracy significantly. However, this strategy must repeatedly perform regression on the data passed between adjacent stages, which makes the algorithm procedure complex. Therefore, a facial landmark detection algorithm based on the Asymmetric Convolution-Squeeze Excitation-Next Residual Network (AC-SE-ResNeXt) was proposed. It uses only single-stage regression, which simplifies the procedure and removes the data preprocessing between adjacent stages that prevents real-time operation. To maintain accuracy, the Asymmetric Convolution (AC) module and the Squeeze-and-Excitation (SE) module were added to the Next Residual Network (ResNeXt) block to construct the AC-SE-ResNeXt network model. In addition, to better fit faces in complex environments with varying illumination, pose and expression, the AC-SE-ResNeXt network model was deepened to 101 layers. The trained model was tested on the BioID and LFPW datasets. For five-point facial landmark detection, the overall mean error rate of the model was 1.99% on BioID and 2.3% on LFPW. Experimental results show that, with a simplified procedure and end-to-end processing, the improved algorithm matches the accuracy of the cascaded DCNN algorithm while significantly improving robustness.
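The two modules added to the ResNeXt block can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: all function names are hypothetical, batch normalization and grouped convolution are omitted, and the structure follows the general descriptions of asymmetric convolution (parallel 3x3, 1x3 and 3x1 branches whose outputs are summed) and squeeze-and-excitation (global average pooling followed by two fully connected layers that produce per-channel gates). The abstract does not state the kernel sizes or reduction ratio used in the paper, so those are assumptions here.

```python
import math

def conv2d_same(x, k):
    """Cross-correlate 2D list x with an odd-sized kernel k, zero padded."""
    kh, kw = len(k), len(k[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for a in range(kh):
                for b in range(kw):
                    ii, jj = i + a - ph, j + b - pw
                    if 0 <= ii < h and 0 <= jj < w:
                        s += x[ii][jj] * k[a][b]
            out[i][j] = s
    return out

def asymmetric_conv(x, k_square, k_row, k_col):
    """Sum the outputs of the 3x3, 1x3 and 3x1 branches (assumed layout)."""
    branches = [conv2d_same(x, k) for k in (k_square, k_row, k_col)]
    h, w = len(x), len(x[0])
    return [[sum(br[i][j] for br in branches) for j in range(w)] for i in range(h)]

def fuse_kernels(k_square, k_row, k_col):
    """Fold the 1x3 and 3x1 kernels into the 3x3 kernel's middle row/column."""
    fused = [row[:] for row in k_square]
    for b in range(3):
        fused[1][b] += k_row[0][b]
    for a in range(3):
        fused[a][1] += k_col[a][0]
    return fused

def squeeze_excite(channels, w1, w2):
    """channels: C feature maps; w1: (C/r) x C weights; w2: C x (C/r) weights."""
    # Squeeze: global average pooling, one scalar per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in channels]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one gate per channel.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
             for row in w2]
    # Scale: reweight every channel by its gate.
    scaled = [[[g * v for v in row] for row in ch] for ch, g in zip(channels, gates)]
    return scaled, gates
```

Because convolution is linear, the three asymmetric branches give the same output as a single 3x3 convolution with the fused kernel, so in this simplified setting the extra branches add no inference cost once training is done.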

Key words: facial landmark detection, Asymmetric Convolution (AC), Squeeze-and-Excitation (SE) module, Convolutional Neural Network (CNN), Next Residual Network (ResNeXt)



WANG Hebing, ZHANG Chunmei. Facial landmark detection based on ResNeXt with asymmetric convolution and squeeze excitation[J]. Journal of Computer Applications, 2021, 41(9): 2741-2747.


