Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (1): 216-223. DOI: 10.11772/j.issn.1001-9081.2025010051
• Multimedia computing and computer simulation •
Jinyu LIANG1, Hongjuan GAO1,2, Xiaofei DU1
Received: 2025-01-15
Revised: 2025-03-26
Accepted: 2025-03-26
Online: 2026-01-10
Published: 2026-01-10
Contact: Hongjuan GAO
About author: LIANG Jinyu, born in 2000 in Zhongwei, Ningxia, M. S. candidate. His research interests include computer vision, graphics and image processing.
Jinyu LIANG, Hongjuan GAO, Xiaofei DU. 3D face generation method based on latent feature enhancement for disentanglement[J]. Journal of Computer Applications, 2026, 46(1): 216-223.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2025010051
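| Layer No. | Layer name | Input size | Output size |
|---|---|---|---|
| 1 | Spiral convolution layer | 71 926 | 71 926 |
| | Instance normalization layer | 71 926 | 71 926 |
| | Nonlinear layer | 71 926 | 71 926 |
| | Downsampling layer | 71 926 | 17 982 |
| 2 | Spiral convolution layer | 17 982 | 17 982 |
| | Instance normalization layer | 17 982 | 17 982 |
| | Nonlinear layer | 17 982 | 17 982 |
| | Downsampling layer | 17 982 | 4 496 |
| 3 | Spiral convolution layer | 4 496 | 4 496 |
| | Instance normalization layer | 4 496 | 4 496 |
| | Nonlinear layer | 4 496 | 4 496 |
| | Downsampling layer | 4 496 | 1 124 |
| 4 | Spiral convolution layer | 1 124 | 1 124 |
| | Instance normalization layer | 1 124 | 1 124 |
| | Nonlinear layer | 1 124 | 1 124 |
| | Downsampling layer | 1 124 | 281 |
| 5 | Spiral convolution layer | 281 | 281 |
| | Instance normalization layer | 281 | 281 |
| | Nonlinear layer | 281 | 281 |
| | Downsampling layer | 281 | 60 |
| 6 | Fully connected layer | 60 | 60 |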
Tab. 1 Encoder layer input and output
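As a quick sanity check on Tab. 1, the following minimal sketch computes the size reduction at each downsampling layer. The sizes are copied from the table; the helper name `reduction_ratios` is hypothetical, and reading the sizes as per-stage element counts is an assumption rather than something the table states.

```python
# Sizes from Tab. 1: input of layer 1, then the output of each downsampling layer.
# Assumption: these are per-stage element (e.g. vertex) counts.
sizes = [71926, 17982, 4496, 1124, 281, 60]


def reduction_ratios(counts):
    """Return the ratio between each consecutive pair of sizes."""
    return [before / after for before, after in zip(counts, counts[1:])]


ratios = reduction_ratios(sizes)
for before, after, ratio in zip(sizes, sizes[1:], ratios):
    print(f"{before:>6} -> {after:>6}  ratio {ratio:.2f}")
```

Under this reading, the first four downsampling layers each shrink the input by a factor of about 4, while the last one (281 to 60) uses a different ratio.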
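| Layer No. | Layer name | Input size | Output size |
|---|---|---|---|
| 1 | Fully connected layer | 60 | 60 |
| 2 | Spiral convolution layer | 60 | 281 |
| | Upsampling layer | 281 | 281 |
| 3 | Spiral convolution layer | 281 | 1 124 |
| | Upsampling layer | 1 124 | 1 124 |
| 4 | Spiral convolution layer | 1 124 | 4 496 |
| | Upsampling layer | 4 496 | 4 496 |
| 5 | Spiral convolution layer | 4 496 | 17 982 |
| | Upsampling layer | 17 982 | 17 982 |
| 6 | Spiral convolution layer | 17 982 | 71 926 |
| | Upsampling layer | 71 926 | 71 926 |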
Tab. 2 Decoder layer input and output
| Method | Diversity | JSD | MMD | COV (%) | VP (%) | Training time (min) |
|---|---|---|---|---|---|---|
| β-VAE | 3.93 | 6.31 | 1.06 | 59.07 | 65.92 | 106 |
| DIP-VAE-I | 3.12 | 11.15 | 0.93 | 52.82 | 50.60 | 108 |
| LSGAN | 5.92 | 2.52 | 1.54 | 43.95 | 71.39 | 374 |
| WGAN | 4.46 | 5.41 | 1.26 | 56.65 | 77.15 | 313 |
| SD-VAE | 4.13 | 4.59 | 1.50 | 64.60 | 80.71 | 441 |
| LED-VAE | 5.22 | 2.31 | 2.24 | 49.79 | 81.34 | 342 |
| SHLED-VAE | 5.13 | 3.32 | 1.70 | 46.69 | 83.21 | 288 |
| LFED | 5.26 | 2.34 | 1.84 | 47.58 | 95.67 | 388 |
Tab. 3 Comparison of experimental results from different methods
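To make the VP comparison in Tab. 3 concrete, the sketch below finds the strongest baseline on that metric and the margin by which LFED exceeds it. The values are copied from the table; the variable names are illustrative, and treating a higher VP as better is an assumption, since the table header does not state the direction.

```python
# VP (%) per method, copied from Tab. 3.
vp = {
    "β-VAE": 65.92,
    "DIP-VAE-I": 50.60,
    "LSGAN": 71.39,
    "WGAN": 77.15,
    "SD-VAE": 80.71,
    "LED-VAE": 81.34,
    "SHLED-VAE": 83.21,
    "LFED": 95.67,
}

# Assumption: higher VP is better, so the best baseline is the max over non-LFED rows.
baselines = {name: score for name, score in vp.items() if name != "LFED"}
best_baseline = max(baselines, key=baselines.get)

# Margin in percentage points, rounded to the table's two-decimal precision.
margin = round(vp["LFED"] - baselines[best_baseline], 2)
print(f"Best baseline: {best_baseline}, LFED margin: {margin} percentage points")
```

On these numbers the best baseline is SHLED-VAE (83.21), and LFED is 12.46 percentage points higher.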
| Module | VP (%) |
|---|---|
| None | 81.34 |
| PA | 84.50 |
| VQ | 83.84 |
| VQ+PA | 90.27 |
| EC | 87.77 |
| VQ+PA+EC | 95.67 |
Tab. 4 Experimental results of each module in VP metric
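The ablation in Tab. 4 is easier to read as gains over the configuration with no added module. The sketch below computes those gains from the table's values; the dictionary and variable names are illustrative only.

```python
# VP (%) per module configuration, copied from Tab. 4.
vp = {
    "None": 81.34,
    "PA": 84.50,
    "VQ": 83.84,
    "VQ+PA": 90.27,
    "EC": 87.77,
    "VQ+PA+EC": 95.67,
}

# Gain of each configuration over the "None" row, in percentage points.
gains = {
    name: round(score - vp["None"], 2)
    for name, score in vp.items()
    if name != "None"
}

for name, gain in gains.items():
    print(f"{name:<9} +{gain:.2f}")

# Compare the combined VQ+PA gain with the sum of the two individual gains.
sum_of_singles = round(gains["PA"] + gains["VQ"], 2)
```

On these numbers, VQ+PA gains 8.93 points, more than the 5.66 obtained by adding the separate PA and VQ gains, and the full VQ+PA+EC configuration gains 14.33 points.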
| [1] | SILVA A G DA, MENDES GOMES M V, WINKLER I. Virtual reality and digital human modeling for ergonomic assessment in industrial product development: a patent and literature review [J]. Applied Sciences, 2022, 12(3): No.1084. |
| [2] | HE Q, LI L, LI D, et al. From digital human modeling to human digital twin: framework and perspectives in human factors [J]. Chinese Journal of Mechanical Engineering, 2024, 37(1): No.9. |
| [3] | YANG Y, ZHANG H, FERNÁNDEZ A B, et al. Digitalization of 3-D human bodies: a survey [J]. IEEE Transactions on Consumer Electronics, 2024, 70(1): 3152-3166. |
| [4] | FOTI S, KOO B, STOYANOV D, et al. 3D shape variational autoencoder latent disentanglement via mini-batch feature swapping for bodies and faces [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 18709-18718. |
| [5] | FOTI S, KOO B, STOYANOV D, et al. 3D generative model latent disentanglement via local eigenprojection [J]. Computer Graphics Forum, 2023, 42(6): No.e14793. |
| [6] | ZHAO S J, LIANG J Y, DU X F, et al. 3D face generation method based on local feature projection of spherical harmonic [J]. Journal of Computer-Aided Design and Computer Graphics, 2025, 37(3): 385-395. (in Chinese) |
| [7] | HUANG Z. New face recognition technologies based on 3DMM [C]// Proceedings of the SPIE 12153, International Conference on Artificial Intelligence, Virtual Reality, and Visualization. Bellingham, WA: SPIE, 2021: No.1215315. |
| [8] | ZHAO Y, CAO X, LIU S, et al. A facial expression transfer method based on 3DMM and diffusion models [C]// Proceedings of the 2024 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2024: 3145-3149. |
| [9] | ZHANG H, REN Y, CHEN Y, et al. Exploiting multiple guidance from 3DMM for face reenactment [EB/OL]. [2024-04-28]. |
| [10] | TRAN L, LIU X. Nonlinear 3D face morphable model [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7346-7355. |
| [11] | BOURITSAS G, BOKHNYAK S, PLOUMPIS S, et al. Neural 3D morphable models: spiral convolutional networks for 3D shape representation learning and generation [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 7212-7221. |
| [12] | BLANZ V, VETTER T. A morphable model for the synthesis of 3D faces [J]. Seminal Graphics Papers: Pushing the Boundaries, 2023, 2: No.18. |
| [13] | WANG X, LU H, LIU X, et al. Dynamic coordination of miscible polymer blends towards highly designable shape memory effect [J]. Polymer, 2020, 208: No.122946. |
| [14] | ALEXA M, COHEN-OR D, LEVIN D. As-rigid-as-possible shape interpolation [C]// Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 2000: 157-164. |
| [15] | ONIZUKA H, THOMAS D, UCHIYAMA H, et al. Landmark-guided deformation transfer of template facial expressions for automatic generation of avatar blend-shapes [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshop. Piscataway: IEEE, 2019: 2100-2108. |
| [16] | RACKOVIĆ S, SOARES C, JAKOVETIĆ D, et al. Clustering of the blendshape facial model [C]// Proceedings of the 29th European Signal Processing Conference. Piscataway: IEEE, 2021: 1556-1560. |
| [17] | MING X, LI J, LING J, et al. High-quality mesh blendshape generation from face videos via neural inverse rendering [C]// Proceedings of the 2024 European Conference on Computer Vision, LNCS 15128. Cham: Springer, 2025: 106-125. |
| [18] | GECER B, PLOUMPIS S, KOTSIA I, et al. GANFIT: generative adversarial network fitting for high fidelity 3D face reconstruction [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1155-1164. |
| [19] | MOSCHOGLOU S, PLOUMPIS S, NICOLAOU M A, et al. 3DFaceGAN: adversarial nets for 3D face representation, generation, and translation [J]. International Journal of Computer Vision, 2020, 128(10/11): 2534-2551. |
| [20] | LAN Y, MENG X, YANG S, et al. Self-supervised geometry-aware encoder for style-based 3D GAN inversion [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 20940-20949. |
| [21] | LIU Z, LI M, ZHANG Y, et al. Fine-grained face swapping via regional GAN inversion [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 8578-8587. |
| [22] | RAI A, GUPTA H, PANDEY A, et al. Towards realistic generative 3D face models [C]// Proceedings of the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2024: 3726-3736. |
| [23] | SONG X, FENG X, ZHU L, et al. SNP site-drug association prediction algorithm based on denoising variational auto-encoder [J]. Journal of Measurement Science and Instrumentation, 2022, 13(3): 300-308. |
| [24] | KIM S U, ROH J, IM H, et al. Anisotropic SpiralNet for 3D shape completion and denoising [J]. Sensors, 2022, 22(17): No.6457. |
| [25] | DEY R, BODDETI V N. Generating diverse 3D reconstructions from a single occluded face image [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 1537-1547. |
| [26] | ZHANG W, CUN X, WANG X, et al. SadTalker: learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 8652-8661. |
| [27] | TAN Q, ZHANG L X, YANG J, et al. Variational autoencoders for localized mesh deformation component analysis [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(10): 6297-6310. |
| [28] | YANG J, MO K, LAI Y K, et al. DSG-Net: learning disentangled structure and geometry for 3D shape generation [J]. ACM Transactions on Graphics, 2023, 42(1): No.1. |
| [29] | LI G, YANG H, HUANG D, et al. 3D face modeling via weakly-supervised disentanglement network joint identity-consistency prior [C]// Proceedings of the IEEE 18th International Conference on Automatic Face and Gesture Recognition. Piscataway: IEEE, 2024: 1-10. |
| [30] | GONG S, CHEN L, BRONSTEIN M, et al. SpiralNet++: a fast and highly efficient mesh convolution operator [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops. Piscataway: IEEE, 2019: 4141-4148. |
| [31] | REDEKOP E, PLEASURE M, WANG Z, et al. Codebook VQ-VAE approach for prostate cancer diagnosis using multiparametric MRI [C]// Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2024: 2365-2372. |
| [32] | NAIK P A, ESKANDARI Z. Nonlinear dynamics of a three-dimensional discrete time delay neural network [J]. International Journal of Biomathematics, 2024, 17(6): No.2350057. |
| [33] | FU J, LIU J, TIAN H, et al. Dual attention network for scene segmentation [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 3141-3149. |
| [34] | YU X, TANG L, RAO Y, et al. Point-BERT: pre-training 3D point cloud transformers with masked point modeling [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 19291-19300. |
| [35] | PLOUMPIS S, WANG H, PEARS N, et al. Combining 3D morphable models: a large scale face-and-head model [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 10926-10935. |
| [36] | ACHLIOPTAS P, DIAMANTI O, MITLIAGKAS I, et al. Learning representations and generative models for 3D point clouds [C]// Proceedings of the 35th International Conference on Machine Learning. New York: JMLR.org, 2018: 40-49. |
| [37] | ZHU X, XU C, TAO D. Learning disentangled representations with latent variation predictability [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12355. Cham: Springer, 2020: 684-700. |
| [38] | HIGGINS I, MATTHEY L, PAL A, et al. β-VAE: learning basic visual concepts with a constrained variational framework [EB/OL]. [2024-04-28]. |
| [39] | KUMAR A, SATTIGERI P, BALAKRISHNAN A. Variational inference of disentangled latent concepts from unlabeled observations [EB/OL]. [2024-04-28]. |
| [40] | MAO X, LI Q, XIE H, et al. Least squares generative adversarial networks [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2813-2821. |
| [41] | ARJOVSKY M, CHINTALA S, BOTTOU L. Wasserstein generative adversarial networks [C]// Proceedings of the 34th International Conference on Machine Learning. New York: JMLR.org, 2017: 214-223. |