Review of multi-modal research methods for face recognition

Journal of Computer Applications, 2025, Vol. 45, Issue (5): 1645-1657. DOI: 10.11772/j.issn.1001-9081.2024050568

Section: Multimedia computing and computer simulation

Yali YANG¹, Ying LI¹,²,³, Yutao ZHANG¹, Peihua SONG¹,²,³

1. School of Logistics Management and Engineering, Nanning Normal University, Nanning, Guangxi 530100, China
2. Guangxi Key Laboratory of Intelligent Logistics Technology (Nanning Normal University), Nanning, Guangxi 530100, China
3. Guangxi Key Lab of Human-machine Interaction and Intelligent Decision (Nanning Normal University), Nanning, Guangxi 530100, China

Received: 2024-05-09 Revised: 2024-07-10 Accepted: 2024-07-31 Online: 2024-08-23 Published: 2025-05-10
Contact: Ying LI
About the authors:
YANG Yali, born in 1998 in Xinyang, Henan, M. S. candidate. Her research interests include face recognition and computer vision.
LI Ying, born in 1973 in Ningming, Guangxi, Ph. D., associate professor. Her research interests include medical image processing and deep learning.
ZHANG Yutao, born in 1998 in Wenzhou, Zhejiang, M. S. candidate. His research interests include image analysis and deep learning.
SONG Peihua, born in 1982 in Fuzhou, Jiangxi, Ph. D., associate professor. His research interests include computer graphics and deep learning.
Supported by:
National Natural Science Foundation of China (62062051); Innovation Program of Guangxi Graduate Education (JGY2023222)

Abstract

Multi-modal face recognition can exploit facial features together with other biometric features to improve the robustness and security of recognition, and has broad practical value. Current research still suffers from the modality gap and from the difficulty of fusing modal information efficiently. Therefore, existing multi-modal face recognition methods were classified and reviewed according to the information modalities involved and the application purpose, in order to sort out open problems and explore future directions. Firstly, research based on multi-source information fusion was divided into sensor-level, feature-level, score-level, and decision-level fusion according to the stage of data processing at which fusion takes place, and the advantages, limitations, and applicable scenarios of existing methods were summarized. Secondly, research on information-enhanced multi-modal face recognition was divided into 2D-3D and 3D-2D information enhancement according to which modality is enhanced, and the advantages and disadvantages of existing methods were summarized. Thirdly, multi-modal face recognition methods based on other biometric features and methods aimed at anti-spoofing were summarized, and commonly used multi-modal face recognition datasets were briefly introduced. Finally, the key challenges facing multi-modal face recognition research were presented, and future research directions were discussed.

Key words: multi-modal face recognition, feature fusion, information enhancement, biometric feature, anti-spoofing

CLC number: TP391

Yali YANG, Ying LI, Yutao ZHANG, Peihua SONG. Review of multi-modal research methods for face recognition[J]. Journal of Computer Applications, 2025, 45(5): 1645-1657.

Figures/Tables: 13

References: 90

1	PENG C， WANG N， LI J， et al. DLFace： deep local descriptor for cross-modality face recognition［J］. Pattern Recognition， 2019， 90： 161-171.
2	LIU D， GAO X， WANG N， et al. Coupled attribute learning for heterogeneous face recognition［J］. IEEE Transactions on Neural Networks and Learning Systems， 2020， 31（11）： 4699-4712.
3	XU X L， LIU T， TIAN G H， et al. Review of occlusion face recognition methods［J］. Computer Engineering and Applications， 2021， 57（17）： 46-60. (in Chinese)
4	LIU L， GONG Y， ZHAO G Q. Research progress in three-dimensional face recognition［J］. Computer Engineering and Applications， 2023， 59（23）： 28-47. (in Chinese)
5	YU Z， QIN Y， LI X， et al. Deep learning for face anti-spoofing： a survey［J］. IEEE Transactions on Pattern Analysis and Machine Intelligence， 2023， 45（5）： 5609-5631.
6	PATTNAIK I， DEV A， MOHAPATRA A K. A face recognition taxonomy and review framework towards dimensionality， modality and feature quality［J］. Engineering Applications of Artificial Intelligence， 2023， 126（Pt C）： No.107056.
7	QIN Z， ZHAO P， ZHUANG T， et al. A survey of identity recognition via data fusion and feature learning［J］. Information Fusion， 2023， 91： 694-712.
8	JIANG X， MA J， XIAO G， et al. A review of multimodal image matching： methods and applications［J］. Information Fusion， 2021， 73： 22-71.
9	ZHANG H， XU H， TIAN X， et al. Image fusion meets deep learning： a survey and perspective［J］. Information Fusion， 2021， 76： 323-336.
10	PUROHIT H， AJMERA P K. Optimal feature level fusion for secured human authentication in multimodal biometric system［J］. Machine Vision and Applications， 2021， 32： No.24.
11	ZHANG J， JIAO L， MA W， et al. Transformer based conditional GAN for multimodal image fusion［J］. IEEE Transactions on Multimedia， 2023， 25： 8988-9001.
12	LI W， ZHANG Y， WANG G， et al. DFENet： a dual-branch feature enhanced network integrating Transformers and convolutional feature learning for multimodal medical image fusion［J］. Biomedical Signal Processing and Control， 2023， 80（Pt 2）： No.104402.
13	CHANDRAKALA M， DEVI P D. Two-stage classifier for face recognition using HOG features［J］. Materials Today： Proceedings， 2021， 47（Pt 17）： 5771-5775.
14	AGGARWAL A， ALSHEHRI M， KUMAR M， et al. Principal component analysis， hidden Markov model， and artificial neural network inspired techniques to recognize faces［J］. Concurrency and Computation： Practice and Experience， 2021， 33（9）： No.e6157.
15	ABUSHAM E， IBRAHIM B， ZIA K， et al. Facial image encryption for secure face recognition system［J］. Electronics， 2023， 12（3）： No.774.
16	GUPTA S， THAKUR K， KUMAR M. 2D-human face recognition using SIFT and SURF descriptors of face’s feature regions［J］. The Visual Computer， 2021， 37（3）： 447-456.
17	HAMMOUCHE R， ATTIA A， AKHROUF S， et al. Gabor filter bank with deep autoencoder based face recognition system［J］. Expert Systems with Applications， 2022， 197： No.116743.
18	KARANWAL S， DIWAKAR M. OD-LBP： orthogonal difference-local binary pattern for face recognition［J］. Digital Signal Processing， 2021， 110： No.102948.
19	BAYOUDH K， KNANI R， HAMDAOUI F， et al. A survey on deep multimodal learning for computer vision： advances， trends， applications， and datasets［J］. The Visual Computer， 2022， 38（8）： 2939-2970.
20	KAUR H， KOUNDAL D， KADYAN V. Image fusion techniques： a survey［J］. Archives of Computational Methods in Engineering， 2021， 28（7）： 4425-4447.
21	HUANG Z H， LI W J， WANG J， et al. Face recognition based on pixel-level and feature-level fusion of the top-level’s wavelet sub-bands［J］. Information Fusion， 2015， 22： 95-104.
22	TISTARELLI M， CADONI M， LAGORIO A， et al. Blending 2D and 3D face recognition［M］// BOURLAI T. Face recognition across the imaging spectrum. Cham： Springer， 2016： 305-331.
23	ZHANG H， LI Q， SUN Z. Adversarial learning semantic volume for 2D/3D face shape regression in the wild［J］. IEEE Transactions on Image Processing， 2019， 28（9）： 4526-4540.
24	CHEN P， LI X， WANG W. Improving occluded face recognition with image fusion［C］// Proceedings of the 13th International Congress on Image and Signal Processing， BioMedical Engineering and Informatics. Piscataway： IEEE， 2020： 259-265.
25	XIAO G， BAVIRISETTI D P， LIU G， et al. Feature-level image fusion［M］// Image fusion. Singapore： Springer， 2020： 103-147.
26	LUMINI A， NANNI L. Overview of the combination of biometric matchers［J］. Information Fusion， 2017， 33： 71-85.
27	JIANG L， ZHANG J， LI C， et al. RGB-D face recognition via spatial and channel attentions［C］// Proceedings of the IEEE 5th Advanced Information Technology， Electronic and Automation Control Conference. Piscataway： IEEE， 2021： 2037-2041.
28	SEPAS-MOGHADDAM A， CORREIA P L， NASROLLAHI K， et al. Light field based face recognition via a fused deep representation［C］// Proceedings of the IEEE 28th International Workshop on Machine Learning for Signal Processing. Piscataway： IEEE， 2018： 1-6.
29	ZHANG X， ZHAO Y， ZHANG H. MixNet face recognition how combing 2D and 3D data can increase the precision［J］. IOP Conference Series： Materials Science and Engineering， 2020， 782（5）： No.052037.
30	LI C， HUANG W， HUANG Y. Gabor Log-Euclidean Gaussian and its fusion with deep network based on self-attention for face recognition［J］. Applied Soft Computing， 2022， 116： No.108210.
31	GAO G， XU Z， LI J， et al. CTCNet： a CNN-transformer cooperation network for face image super-resolution［J］. IEEE Transactions on Image Processing， 2023， 32： 1978-1991.
32	TIONG L C O， KIM S T， RO Y M. Multimodal facial biometrics recognition： dual-stream convolutional neural networks with multi-feature fusion layers［J］. Image and Vision Computing， 2020， 102： No.103977.
33	CHAROQDOUZ E， HASSANPOUR H. Feature extraction from several angular faces using a deep learning based fusion technique for face recognition［J］. International Journal of Engineering， 2023， 36（8）： 1548-1555.
34	UPPAL H， SEPAS-MOGHADDAM A， GREENSPAN M， et al. Two-level attention-based fusion learning for RGB-D face recognition［C］// Proceedings of the 25th International Conference on Pattern Recognition. Piscataway： IEEE， 2021： 10120-10127.
35	ALFAWWAZ B M， AL-SHATNAWI A， AL-SAQQAR F， et al. Face recognition system based on the multi-resolution singular value decomposition fusion technique［J］. International Journal of Data and Network Science， 2022， 6（4）： 1249-1260.
36	SUN Z， MIAO Y， JEON J Y， et al. Facial feature fusion convolutional neural network for driver fatigue detection［J］. Engineering Applications of Artificial Intelligence， 2023， 126（Pt C）： No.106981.
37	ZHANG J， YAN X， CHENG Z， et al. A face recognition algorithm based on feature fusion［J］. Concurrency and Computation： Practice and Experience， 2022， 34（14）： No.e5748.
38	CHEN B J， WANG P， YU L Y， et al. Locally GAN-generated face detection algorithm based on dual-stream features fused by attention［J］. Journal of Southeast University （Natural Science Edition）， 2023， 53（3）： 543-551. (in Chinese)
39	XU R， WANG K， DENG C， et al. Depth map denoising network and lightweight fusion network for enhanced 3D face recognition［J］. Pattern Recognition， 2024， 145： No.109936.
40	ZHANG P， TAN L， YANG Z， et al. Device-edge collaborative occluded face recognition method based on cross-domain feature fusion［J/OL］. Digital Communications and Networks ［2025-02-07］. .
41	SOLTANPOUR S， WU Q J. Multimodal 2D-3D face recognition using local descriptors： pyramidal shape map and structural context［J］. IET Biometrics， 2017， 6： 27-35.
42	OUAMANE A， BOUTELLAA E， BENGHERABI M， et al. A novel statistical and multiscale local binary feature for 2D and 3D face verification［J］. Computers and Electrical Engineering， 2017， 62： 68-80.
43	YANG H， WANG T， YIN L. Adaptive multimodal fusion for facial action units recognition［C］// Proceedings of the 28th ACM International Conference on Multimedia. New York： ACM， 2020： 2982-2990.
44	XU J， XUE X， WU Y， et al. Matching a composite sketch to a photographed face using fused HOG and deep feature models［J］. The Visual Computer， 2021， 37（4）： 765-776.
45	SINGH S， SINGH H， BUENO G， et al. A review of image fusion： Methods， applications and performance metrics［J］. Digital Signal Processing， 2023， 137： No.104020.
46	AISSAOUI A， MARTINET J. Bi-modal face recognition — how combining 2D and 3D clues can increase the precision［C］// Proceedings of the 10th International Conference on Computer Vision Theory and Applications — Volume 2： VISAPP. Setúbal： SciTePress， 2015： 559-564.
47	XIE Z， SHI L， LI Y. Two-stage fusion of local binary pattern and discrete cosine transform for infrared and visible face recognition［C］// Proceedings of the 2020 International Conference on Intelligent， Interactive Systems and Applications， AISC 1304. Cham： Springer， 2021： 967-975.
48	ALSHAHRANI A A， JAHA E S， ALOWIDI N. Fusion of hash-based hard and soft biometrics for enhancing face image database search and retrieval［J］. Computers， Materials and Continua， 2023， 77（3）：3489-3509.
49	SING J K， DEY A， GHOSH M. Confidence factor weighted Gaussian function induced parallel fuzzy rank-level fusion for inference and its application to face recognition［J］. Information Fusion， 2019， 47： 60-71.
50	KUMAR S， SINGH S K. Occluded thermal face recognition using Bag Of CNN （BoCNN）［J］. IEEE Signal Processing Letters， 2020， 27： 975-979.
51	TORKHANI G， LADGHAM A， SAKLY A， et al. A 3D-2D face recognition method based on extended Gabor wavelet combining curvature and edge detection［J］. Signal， Image and Video Processing， 2017， 11（5）： 969-976.
52	DANNER M， WEBER T， HUBER P， et al. Evolutional normal maps： 3D face representations for 2D-3D face recognition， face modelling and data augmentation［C］// Proceedings of the 17th International Conference on Computer Vision Theory and Applications — Volume 5： VISAPP. Setúbal： SciTePress， 2022： 267-274.
53	NIU W， ZHAO Y， YU Z， et al. Research on a face recognition algorithm based on 3D face data and 2D face image matching［J］. Journal of Visual Communication and Image Representation， 2023， 91： No.103757.
54	JIN B， CRUZ L， GONÇALVES N. Pseudo RGB-D face recognition［J］. IEEE Sensors Journal， 2022， 22（22）： 21780-21794.
55	ZHU Y， GAO J， WU T， et al. Exploiting enhanced and robust RGB-D face representation via progressive multi-modal learning［J］. Pattern Recognition Letters， 2023， 166： 38-45.
56	KAKADIARIS I A， TODERICI G， EVANGELOPOULOS G， et al. 3D-2D face recognition with pose and illumination normalization［J］. Computer Vision and Image Understanding， 2017， 154： 137-151.
57	SANIL G， PRAKASH K， PRABHU S， et al. 2D-3D facial image analysis for identification of facial features using machine learning algorithms with hyper-parameter optimization for forensics applications［J］. IEEE Access， 2023， 11： 82521-82538.
58	DI MARTINO J M， SUZACQ F， DELBRACIO M， et al. Differential 3D facial recognition： adding 3D to your state-of-the-art 2D method［J］. IEEE Transactions on Pattern Analysis and Machine Intelligence， 2020， 42（7）： 1582-1593.
59	WINARNO E， AMIN I H AL， HARTATI S， et al. Face recognition based on CNN 2D-3D reconstruction using shape and texture vectors combining［J］. Indonesian Journal of Electrical Engineering and Informatics， 2020， 8（2）： 378-384.
60	SARANGI P P， NAYAK D R， PANDA M， et al. A feature-level fusion based improved multimodal biometric recognition system using ear and profile face［J］. Journal of Ambient Intelligence and Humanized Computing， 2022， 13（4）： 1867-1898.
61	VEKARIYA V， JOSHI M， DIKSHIT S. Multi-biometric fusion for enhanced human authentication in information security［J］. Measurement： Sensors， 2024， 31： No.100973.
62	SHEN S， ZHANG W H， WANG R C， et al. Human face and gait feature attention fusion based identity recognition method［J］. Journal of Chinese Computer Systems， 2024， 45（7）： 1695-1701. (in Chinese)
63	ZHANG X， CHENG D， JIA P， et al. An efficient android-based multimodal biometric authentication system with face and voice［J］. IEEE Access， 2020， 8： 102757-102772.
64	ALEEM S， YANG P， MASOOD S， et al. An accurate multi-modal biometric identification system for person identification via fusion of face and finger print［J］. World Wide Web， 2020， 23： 1299-1317.
65	THAREWAL S， MALCHE T， TIWARI P K， et al. Score-level fusion of 3D face and 3D ear for multimodal biometric human recognition［J］. Computational Intelligence and Neuroscience， 2022， 2022： No.3019194.
66	ABOZAID A， HAGGAG A， KASBAN H， et al. Multimodal biometric scheme for human authentication technique based on voice and face recognition fusion［J］. Multimedia Tools and Applications， 2019， 78（12）： 16345-16361.
67	HATTAB A， BEHLOUL A. Face-Iris multimodal biometric recognition system based on deep learning［J］. Multimedia Tools and Applications， 2024， 83（14）： 43349-43376.
68	AMMOUR B， BOUBCHIR L， BOUDEN T， et al. Face-iris multimodal biometric identification system［J］. Electronics， 2020， 9（1）： No.85.
69	ALAY N， AL-BAITY H H. Deep learning approach for multimodal biometric recognition system based on fusion of iris， face， and finger vein traits［J］. Sensors， 2020， 20（19）： No.5523.
70	MEHRAJ H， MIR A H. Feature vector extraction and optimisation for multimodal biometrics employing face， ear and gait utilising artificial neural networks［J］. International Journal of Cloud Computing， 2020， 9（2/3）： 131-149.
71	GUPTA K， WALIA G S， SHARMA K. Quality based adaptive score fusion approach for multimodal biometric system［J］. Applied Intelligence， 2020， 50： 1086-1099.
72	LOHITH M S， MANJUNATH Y S K， ESHWARAPPA M N. Multimodal biometric person authentication using face， ear and periocular region based on convolution neural networks［J］. International Journal of Image and Graphics， 2023， 23（2）： No.2350019.
73	KADHIM O N， ABDULAMEER M H. Biometric identification advances： unimodal to multimodal fusion of face， palm， and iris features［J］. Advances in Electrical and Computer Engineering， 2024， 24（1）：91-98.
74	GEORGE A， MOSTAANI Z， GEISSENBUHLER D， et al. Biometric face presentation attack detection with multi-channel convolutional neural network［J］. IEEE Transactions on Information Forensics and Security， 2020， 15： 42-55.
75	YU P， WANG J， CAO N， et al. Research on face anti-spoofing algorithm based on image fusion［J］. Computers， Materials and Continua， 2021， 68（3）： 3861-3876.
76	MA X， JI L X， LI S M. Forgery face detection based on multi-scale Transformer fusing multi-domain information［J］. Computer Science， 2023， 50（10）： 112-118. (in Chinese)
77	LI Z， CUI Y， LIU W， et al. Construction and calibration of a stereo vision acquisition platform for multimodal face antispoofing［J］. Advances in Computer， Signals and Systems， 2023， 7（3）： 22-32.
78	TIAN Y， HUANG Y， ZHANG K， et al. Polarized image translation from nonpolarized cameras for multimodal face anti-spoofing［J］. IEEE Transactions on Information Forensics and Security， 2023， 18： 5651-5664.
79	LI C， LI Z， SUN J， et al. Middle-shallow feature aggregation in multimodality for face anti-spoofing［J］. Scientific Reports， 2023， 13： No.9870.
80	DENG P， GE C， WEI H， et al. Multimodal contrastive learning for face anti-spoofing［J］. Engineering Applications of Artificial Intelligence， 2024， 129： No.107600.
81	RAGHAVENDRA R， LI G. Multimodality for reliable single image based face morphing attack detection［J］. IEEE Access， 2022， 10： 82418-82433.
82	KONG C， ZHENG K， LIU Y， et al. M³FAS： an accurate and robust multimodal mobile face anti-spoofing system［J］. IEEE Transactions on Dependable and Secure Computing， 2024， 21（6）： 5650-5666.
83	BISWAS A S， DEY S， AHIRWAR A K. 3sXcsNet： a framework for face presentation attack detection using deep learning［J］. Expert Systems with Applications， 2024， 243： No.122821.
84	PHILLIPS P J， FLYNN P J， SCRUGGS T， et al. Overview of the face recognition grand challenge［C］// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition — Volume 1. Piscataway： IEEE， 2005： 947-954.
85	YIN L， WEI X， SUN Y， et al. A 3D facial expression database for facial behavior research［C］// Proceedings of the 7th International Conference on Automatic Face and Gesture Recognition. Piscataway： IEEE， 2006： 211-216.
86	FALTEMIER T C， BOWYER K W， FLYNN P J. Using a multi-instance enrollment representation to improve 3D face recognition［C］// Proceedings of the 1st IEEE International Conference on Biometrics： Theory， Applications， and Systems. Piscataway： IEEE， 2007： 1-6.
87	SAVRAN A， ALYÜZ N， DIBEKLIOĞLU H， et al. Bosphorus database for 3D face analysis［C］// Proceedings of the 2008 European Workshop on Biometrics and Identity Management， LNCS 5372. Berlin： Springer， 2008： 47-56.
88	GUPTA S， CASTLEMAN K R， MARKEY M K， et al. Texas 3D face recognition database［C］// Proceedings of the 2010 IEEE Southwest Symposium on Image Analysis and Interpretation. Piscataway： IEEE， 2010： 97-100.
89	COLOMBO A， CUSANO C， SCHETTINI R. UMB-DB： a database of partially occluded 3D faces［C］// Proceedings of the 2011 IEEE International Conference on Computer Vision Workshops. Piscataway： IEEE， 2011： 2113-2119.
90	XU C， TAN T， LI S， et al. Learning effective intrinsic features to boost 3D-based face recognition［C］// Proceedings of the 2006 European Conference on Computer Vision， LNCS 3952. Berlin： Springer， 2006： 416-427.

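Method	Advantages	Limitations	Applicable scenarios
Feature concatenation	Simple to perform, with low time cost; loses no single-modality information and can improve recognition accuracy	Suffers from information redundancy and the modality gap; can degrade the performance of the downstream classifier	Face recognition with multiple sensors or multiple sources of modal information
Feature weighting	Weights the important features, improving the quality of the combined feature	The fused feature is sensitive to how the feature weights are set	Face recognition with a limited number of features and redundant information
Feature fusion	Selectively learns useful information according to the correlation between modalities, improving the quality of the combined feature	Some single-modality information is lost; computational cost is relatively high	Face recognition with many features and high accuracy requirements
Feature-layer fusion	Obtains depth images with more facial detail, and hence more complementary features; better captures the correlation between features of different modalities	Layer-by-layer feature extraction in the network loses some information; the resulting features are hard to interpret	Face recognition with a single sensor or in complex scenes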
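The first two rows of the table above (feature concatenation and feature weighting) can be illustrated with a minimal sketch. This is not taken from any of the surveyed methods: the function names, the L2 normalization step, the toy two-dimensional embeddings, and the weight `w_a` are all assumptions made for illustration.

```python
import math

def l2_normalize(vec):
    """Scale a feature vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def concat_fusion(feat_a, feat_b):
    """Feature concatenation: normalize each modality, then join the vectors end to end."""
    return l2_normalize(feat_a) + l2_normalize(feat_b)

def weighted_fusion(feat_a, feat_b, w_a=0.5):
    """Feature weighting: convex combination of two normalized vectors of equal length."""
    if len(feat_a) != len(feat_b):
        raise ValueError("weighted fusion needs vectors of equal length")
    a, b = l2_normalize(feat_a), l2_normalize(feat_b)
    return [w_a * x + (1.0 - w_a) * y for x, y in zip(a, b)]

# Hypothetical embeddings from a 2D (texture) branch and a 3D (depth) branch
rgb_feat = [3.0, 4.0]
depth_feat = [0.0, 2.0]

fused_concat = concat_fusion(rgb_feat, depth_feat)               # length doubles
fused_weighted = weighted_fusion(rgb_feat, depth_feat, w_a=0.75)  # length unchanged
```

Concatenation keeps every component of both modalities, which is why it loses no single-modality information but also carries redundancy into the classifier. The weighted variant keeps the dimensionality fixed, but its output depends directly on the choice of `w_a`, matching the limitation listed in the table.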


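Fusion strategy	Modalities	Advantages	Limitations
Feature level	Face + ear; face + fingerprint; face + gait	Directly fuses the features extracted from different biometric traits, making full use of the information and improving recognition accuracy and robustness; simple and intuitive, requires no complex algorithms or models, and can be deployed quickly in practice	Different biometric traits differ in representational power, so some features may contribute little to the overall result; cannot capture complex nonlinear relationships between features, which may lead to information loss
Score level	Face + voice; face + fingerprint; 3D face + 3D ear	Learns weights or scores for the biometric traits and dynamically balances their contribution to the recognition result; can be adjusted to the application scenario and the data, giving good adaptability	Usually needs a large amount of labeled data to learn the weights or scores, which limits practical use; risks overfitting when data are scarce or the relationships between features are complex
Hybrid strategy	Face + voice; face + iris	Combines the strengths of feature-level, score-level, decision-level, and other fusion strategies to offset their individual limitations; flexible, since a suitable fusion strategy can be chosen for the scenario and requirements	Several fusion methods must be combined and tuned together, which increases system complexity; tuning the parameters of multiple fusion methods is time-consuming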
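As a rough illustration of the score-level row above, the sketch below normalizes the match scores of two matchers to a common range and combines them with a fixed weight. It is an assumption-laden toy, not a method from the surveyed papers: the min-max normalization, the weight `w_face`, and the sample scores are hypothetical, and the adaptive approaches in the table would learn the weights from data rather than fix them.

```python
def min_max_normalize(scores):
    """Map raw matcher scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def fuse_scores(face_scores, voice_scores, w_face=0.6):
    """Score-level fusion: weighted sum of per-identity scores after normalization."""
    face_n = min_max_normalize(face_scores)
    voice_n = min_max_normalize(voice_scores)
    return [w_face * f + (1.0 - w_face) * v for f, v in zip(face_n, voice_n)]

# Hypothetical scores of one probe against three enrolled identities
face_scores = [0.2, 0.9, 0.4]      # e.g. a similarity in [0, 1]
voice_scores = [10.0, 30.0, 50.0]  # e.g. a matcher with a different score range

fused = fuse_scores(face_scores, voice_scores)
best_identity = max(range(len(fused)), key=fused.__getitem__)  # index of the top fused score
```

Normalizing first matters because the two matchers report scores on different scales; without it the voice matcher would dominate the sum regardless of the weight.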


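Fusion strategy	Method	Advantages	Limitations
Sensor level	Image fusion	Combines data of different modalities captured by multiple sensors, so a missing modality has limited impact; improves system robustness and resistance to spoofing	Multiple sensors raise the cost; the fusion process is complex and must handle data alignment, calibration, etc.
Feature level	Feature-layer fusion	Neural networks extract richer features, and multiple networks can meet the processing needs of different features; extracting features in parallel improves processing efficiency	Designing and tuning multiple networks is more complex than for a single network; the fusion strategy and parameters are hard to set, and computational cost is relatively high
Score level	Adaptive fusion	Dynamically adjusts the weights of different features according to the data characteristics and recognition requirements, giving good adaptability and robustness; feature weights or scores need not be set by hand, so the degree of automation is high	Needs a large amount of labeled data to learn the feature weights or scores; may overfit when data are scarce or the relationships between features are complex
Score level	Multi-level fusion	Combines low-level features with high-level abstract features for a more complete description of the data; advantageous when processing complex data	The fusion strategy and parameter settings are complex; increases the demand for computing resources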
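The sensor-level row above can be sketched in its simplest form as a pixel-wise weighted average of two images that have already been aligned. This is only an illustrative baseline under stated assumptions: the function name, the blending weight `alpha`, and the 2x2 sample images are hypothetical, and the alignment and calibration steps that the table lists as the hard part are assumed to have been done beforehand.

```python
def fuse_images(visible, infrared, alpha=0.5):
    """Pixel-wise weighted average of two registered grayscale images of equal size."""
    same_rows = len(visible) == len(infrared)
    same_cols = all(len(rv) == len(ri) for rv, ri in zip(visible, infrared))
    if not (same_rows and same_cols):
        raise ValueError("images must be aligned to the same size before fusion")
    return [
        [alpha * v + (1.0 - alpha) * t for v, t in zip(row_v, row_t)]
        for row_v, row_t in zip(visible, infrared)
    ]

# Hypothetical 2x2 grayscale images (nested lists of intensities in 0..255)
visible_img = [[100, 200], [50, 0]]
infrared_img = [[0, 100], [150, 255]]

fused_img = fuse_images(visible_img, infrared_img, alpha=0.5)
```

The size check stands in for the alignment requirement: fusing at the pixel level is only meaningful when corresponding pixels depict the same point on the face.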


Dataset	Data type	Subjects	Images	Acquisition device	Release year
FRGC v1.0 [84]	Depth map	1 024	50 000	Minolta Vivid 3D scanner	2002
FRGC v2.0 [84]	Depth map	466	4 007	Minolta Vivid 3D scanner	2005
BU-3DFE [85]	Mesh	100	2 500	Stereo photography, 3DMD digitizer	2006
ND-2006 [86]	Depth map	888	13 450	Minolta Vivid 910 range scanner	2007
Bosphorus [87]	Point cloud	105	4 652	Inspeck Mega Capturor Ⅱ 3D scanner	2008
Texas 3D [88]	Depth map	118	1 149	MU-2 stereo imaging system	2010
UMB-DB [89]	Depth map	143	1 473	Minolta Vivid 900 laser scanner	2011
CASIA 3D [90]	Depth map	123	4 059	Minolta Vivid 910 range scanner	2015
