Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (12): 3659-3664. DOI: 10.11772/j.issn.1001-9081.2019040600

• Virtual Reality and Multimedia Computing •

Real-time multi-face landmark localization algorithm based on deep residual and feature pyramid neural network

XIE Jinheng, ZHANG Yansheng

  1. College of Electronic and Information Engineering, Guangdong Ocean University, Zhanjiang Guangdong 524088, China
  • Received: 2019-04-11 Revised: 2019-07-04 Online: 2019-12-10 Published: 2019-08-26
  • Contact: ZHANG Yansheng
  • About the authors: XIE Jinheng (born 1998), male, from Heyuan, Guangdong; research interests: object detection, face detection and recognition, pose estimation, and pedestrian re-detection. ZHANG Yansheng (born 1962), male, from Tianmen, Hubei, associate professor; research interests: information and communication engineering, and image processing.

Abstract: Most face landmark localization algorithms run in two stages, face region detection followed by single-face landmark localization, which multiplies the processing time. To address this problem, a one-step, real-time, and accurate multi-face landmark localization algorithm is proposed. Heatmaps generated from the face landmark coordinates serve as the data labels. A deep residual network performs the early image feature extraction, and a feature pyramid network fuses features from different network depths, which represent receptive fields of different scales. Following the idea of intermediate supervision, multiple prediction networks are cascaded to regress the landmarks of all faces in an image in one pass, from coarse to fine, without a face detection step. While maintaining high localization accuracy, the algorithm needs only about 0.0075 s for one forward propagation (about 133 frames per second), which satisfies the requirement of real-time face landmark localization. On the Wider Facial Landmarks in-the-Wild (WFLW) test set, it achieves a mean error of 6.06% and a failure rate of 11.70%.

Key words: residual network, feature pyramid network, real-time face landmark localization, intermediate supervision
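The abstract states that landmark coordinates are converted into heatmaps that serve as labels, but it does not give the construction. The following is a minimal sketch of one common way to do this, not the authors' implementation. The Gaussian kernel, the `sigma` value, the heatmap size, the element-wise max used to merge several faces into one channel per landmark index, and the function name `landmarks_to_heatmaps` are all assumptions made for illustration.

```python
import numpy as np


def landmarks_to_heatmaps(faces, height, width, sigma=2.0):
    """Render one heatmap channel per landmark index for all faces in an image.

    faces: array-like of shape (num_faces, num_landmarks, 2) holding (x, y)
           coordinates already scaled to heatmap resolution.
    Returns an array of shape (num_landmarks, height, width), values in [0, 1].
    """
    faces = np.asarray(faces, dtype=np.float64)
    _, num_landmarks, _ = faces.shape
    ys = np.arange(height, dtype=np.float64)[:, None]  # column vector (H, 1)
    xs = np.arange(width, dtype=np.float64)[None, :]   # row vector (1, W)
    heatmaps = np.zeros((num_landmarks, height, width), dtype=np.float64)
    for face in faces:
        for k, (x, y) in enumerate(face):
            # Assumption: an isotropic Gaussian centred on the landmark.
            blob = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
            # Assumption: faces sharing a channel are merged by element-wise
            # max, so every face keeps a peak of 1.0 at its own landmark.
            heatmaps[k] = np.maximum(heatmaps[k], blob)
    return heatmaps


# Two hypothetical faces with two landmarks each, on a 64x64 heatmap.
example_faces = [[(10, 12), (20, 12)], [(40, 30), (50, 30)]]
example_heatmaps = landmarks_to_heatmaps(example_faces, height=64, width=64)
print(example_heatmaps.shape)
```

Because all faces share the same set of channels, a network trained on such labels could output the landmarks of every face in one forward pass, which is consistent with the detection-free design the abstract describes.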

CLC Number: