Abstract: Widely used 3D morphable models have limited expressive power, so 3D face models reconstructed from them generalize poorly. To address this, a method is proposed for 3D face reconstruction and dense face alignment from a single face image under unknown pose, expression and illumination. First, the existing 3D morphable model is improved with a convolutional neural network to increase the expressive power of the 3D face model. Then, based on the smoothness of the face and the similarity of images, a new loss function is defined at the feature-point and pixel levels, and the convolutional neural network is trained with weakly supervised learning. Finally, the trained network is used to perform 3D face reconstruction and dense face alignment. Experimental results show that, for 3D face reconstruction, the proposed model reduces the normalized mean error on AFLW2000-3D to 2.25; for dense face alignment, it reduces the normalized mean errors on AFLW2000-3D and AFLW-LFPA to 3.80 and 3.34 respectively. Compared with the original method based on the 3D morphable model, the proposed model reduces the normalized mean error by 7.4% in 3D face reconstruction and by 7.8% in dense face alignment. The network model is therefore accurate and robust for face images captured under different lighting conditions and viewing angles, and produces high-quality 3D face reconstruction and dense face alignment.
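The normalized mean error (NME) quoted in the abstract can be illustrated with a minimal sketch. The abstract does not state how the error is normalized, so two choices here are assumptions: the normalizer is the size of the ground-truth bounding box (square root of width times height), and the result is scaled to a percentage. The helper names `normalized_mean_error` and `bbox_size` and the sample points are hypothetical.

```python
import math


def normalized_mean_error(pred, truth, norm):
    """Mean Euclidean distance between matched points, divided by `norm`.

    The result is scaled by 100 so it reads as a percentage (an assumption,
    not something the abstract specifies).
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and the same length")
    if norm <= 0:
        raise ValueError("norm must be positive")
    total = sum(math.dist(p, t) for p, t in zip(pred, truth))
    return 100.0 * total / (len(pred) * norm)


def bbox_size(points):
    """Assumed normalizer: sqrt(width * height) of the 2D bounding box."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return math.sqrt((max(xs) - min(xs)) * (max(ys) - min(ys)))


# Hypothetical 2D landmarks: two points are exact, two are off by 5 pixels.
truth = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
pred = [(3.0, 4.0), (100.0, 0.0), (0.0, 100.0), (100.0, 95.0)]

nme = normalized_mean_error(pred, truth, bbox_size(truth))
print(f"NME = {nme:.2f}")  # mean error 2.5 px over a 100 px box -> 2.50
```

The same function applies to 3D vertices if the points carry a third coordinate, since `math.dist` accepts points of any matching dimension; only the normalizer would need to be chosen to match the evaluation protocol actually used.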
ZHOU Jian, HUANG Zhangjin. 3D face reconstruction and dense face alignment method based on improved 3D morphable model[J]. Journal of Computer Applications, 2020, 40(11): 3306-3313.