Multi-pose face reconstruction and recognition based on multi-task learning
OUYANG Ning1,2, MA Yutao2, LIN Leping1,2
1. Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education (Guilin University of Electronic Technology), Guilin, Guangxi 541004, China; 2. School of Information and Communication, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
Abstract: To reduce the impact of pose variation on face recognition and the loss of local facial detail that often occurs during pose recovery, a multi-pose face reconstruction and recognition method based on multi-task learning, named Multi-task Learning Stacked Auto-encoder (MtLSAE), was proposed. Because pose recovery and the preservation of local detail are correlated tasks, MtLSAE adopts a multi-task learning mechanism: while frontal images are recovered step by step, a sparse auto-encoder with non-negativity constraints is introduced to learn part-based features of the face. The whole network is then trained by sharing parameters between these two related tasks. Finally, Fisherface is used to reduce dimensionality and extract discriminative features from the reconstructed frontal face images, and a nearest neighbor classifier performs recognition. Experimental results show that MtLSAE achieves good pose reconstruction quality with clear local facial texture, and a higher recognition rate than classical methods such as Local Gabor Binary Pattern (LGBP), View-based Active Appearance Model (VAAM) and Stacked Progressive Auto-encoder (SPAE).
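The abstract does not give the objective function, so the following is only a minimal sketch of how one layer's two related tasks could share an encoder. It assumes sigmoid units, a squared-error loss for both tasks, a KL-divergence sparsity term, and a quadratic penalty on negative weights (the form commonly used for sparse auto-encoders with non-negativity constraints). All names (`multitask_loss`, `W_enc`, `W_pose`, `W_rec`) and all hyperparameter values (`rho`, `alpha`, `beta`, `lam`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def kl_sparsity(rho, rho_hat):
    """Sum of KL divergences between the target sparsity rho and the
    mean activation rho_hat of each hidden unit."""
    rho_hat = np.clip(rho_hat, 1e-8, 1 - 1e-8)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))


def nonneg_penalty(W):
    """Quadratic penalty that is applied to negative weights only."""
    return 0.5 * np.sum(np.minimum(W, 0.0) ** 2)


def multitask_loss(x, target, W_enc, b_enc, W_pose, b_pose, W_rec, b_rec,
                   rho=0.05, alpha=3e-3, beta=3.0, lam=1.0):
    """Assumed joint loss for one layer.

    x      : input faces at a larger pose, shape (n, d)
    target : the same faces at a smaller pose, shape (n, d)
    The encoder (W_enc, b_enc) is shared by both tasks.
    """
    h = sigmoid(x @ W_enc + b_enc)            # shared hidden representation
    pose_out = sigmoid(h @ W_pose + b_pose)   # task 1: step toward frontal
    self_out = sigmoid(h @ W_rec + b_rec)     # task 2: reconstruct the input
    n = x.shape[0]
    pose_loss = 0.5 * np.sum((pose_out - target) ** 2) / n
    rec_loss = 0.5 * np.sum((self_out - x) ** 2) / n
    sparsity = kl_sparsity(rho, h.mean(axis=0))
    nonneg = nonneg_penalty(W_enc) + nonneg_penalty(W_rec)
    # lam balances the detail-preserving task against pose recovery
    return pose_loss + lam * (rec_loss + beta * sparsity + alpha * nonneg)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k, n = 16, 8, 5                        # toy sizes, not from the paper
    x, target = rng.random((n, d)), rng.random((n, d))
    W_enc = rng.normal(scale=0.1, size=(d, k))
    W_pose = rng.normal(scale=0.1, size=(k, d))
    W_rec = rng.normal(scale=0.1, size=(k, d))
    loss = multitask_loss(x, target, W_enc, np.zeros(k),
                          W_pose, np.zeros(d), W_rec, np.zeros(d))
    print(f"joint loss: {loss:.4f}")
```

In this sketch the shared encoder is the only coupling between the two tasks: gradients from the pose-recovery head and from the constrained reconstruction head both update `W_enc`, which is one simple way to realize parameter sharing. Stacking several such layers, each targeting a smaller pose range, would correspond to the step-wise recovery described above.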
OUYANG Ning, MA Yutao, LIN Leping. Multi-pose face reconstruction and recognition based on multi-task learning[J]. Journal of Computer Applications, 2017, 37(3): 896-900.