3D face generation method based on latent feature enhancement for disentanglement

doi:10.11772/j.issn.1001-9081.2025010051

Journal of Computer Applications, 2026, Vol. 46, Issue (1): 216-223. DOI: 10.11772/j.issn.1001-9081.2025010051

• Multimedia computing and computer simulation •

Jinyu LIANG¹, Hongjuan GAO¹,², Xiaofei DU¹

1. School of Information Engineering, Ningxia University, Yinchuan, Ningxia 750021, China
2. Ningxia Key Laboratory of Artificial Intelligence and Information Security for "East Data West Computing" (Ningxia University), Yinchuan, Ningxia 750021, China

Received: 2025-01-15 Revised: 2025-03-26 Accepted: 2025-03-26 Online: 2026-01-10 Published: 2026-01-10
Contact: Hongjuan GAO
About author: LIANG Jinyu, born in 2000, native of Zhongwei, Ningxia, M. S. candidate. His research interests include computer vision, graphics and image processing.
DU Xiaofei, born in 1999, native of Anyang, Henan, M. S. candidate. His research interests include computer vision, graphics and image processing.
Supported by: Key Research and Development Program of Ningxia (2023BDE03006)

Abstract

To address the insufficient interpretability of latent features, limited disentanglement capability, and poor identity consistency of existing 3D face generation methods, a 3D face generation method based on Latent Feature Enhancement for Disentanglement (LFED) was proposed. Firstly, a vector discretization module was constructed using hierarchical clustering, so that the latent features absorb prior knowledge and disentanglement performance improves. Secondly, a positional attention module was designed to selectively integrate the positional information of the latent features through element-wise summation, ensuring the identity consistency of the generated faces. Finally, the prior knowledge and positional information were combined, and maximum normalization was applied to enhance the interpretability of the latent features during face generation. Experimental results show that the proposed method achieves 95.67% on Variation Predictability (VP), the latent feature disentanglement metric, which is 14.96, 14.33, and 12.46 percentage points higher than Swap Disentangled Variational Auto-Encoder (SD-VAE), Local Eigenprojection Disentangled Variational Auto-Encoder (LED-VAE), and Spherical Harmonic Local Eigenprojection Disentangled Variational Auto-Encoder (SHLED-VAE), respectively. The proposed method thus substantially improves disentanglement performance while maintaining good representation and reconstruction capabilities.

Key words: 3D face generation, latent variable disentanglement, positional attention, vector discretization, feature fusion

CLC Number: TP391.41

Jinyu LIANG, Hongjuan GAO, Xiaofei DU. 3D face generation method based on latent feature enhancement for disentanglement[J]. Journal of Computer Applications, 2026, 46(1): 216-223.

Figures/Tables 10

Fig. 1 Framework of proposed method

Tab. 1 Encoder layer input and output

Layer No.	Layer name	Input size	Output size
1	Spiral convolution layer [30]	71 926 × 3	71 926 × 32
	Instance normalization layer	71 926 × 32	71 926 × 32
	Nonlinearity layer	71 926 × 32	71 926 × 32
	Downsampling layer	71 926 × 32	17 982 × 32
2	Spiral convolution layer [30]	17 982 × 32	17 982 × 32
	Instance normalization layer	17 982 × 32	17 982 × 32
	Nonlinearity layer	17 982 × 32	17 982 × 32
	Downsampling layer	17 982 × 32	4 496 × 32
3	Spiral convolution layer [30]	4 496 × 32	4 496 × 32
	Instance normalization layer	4 496 × 32	4 496 × 32
	Nonlinearity layer	4 496 × 32	4 496 × 32
	Downsampling layer	4 496 × 32	1 124 × 32
4	Spiral convolution layer [30]	1 124 × 32	1 124 × 32
	Instance normalization layer	1 124 × 32	1 124 × 32
	Nonlinearity layer	1 124 × 32	1 124 × 32
	Downsampling layer	1 124 × 32	281 × 64
5	Spiral convolution layer [30]	281 × 64	281 × 64
	Instance normalization layer	281 × 64	281 × 64
	Nonlinearity layer	281 × 64	281 × 64
	Downsampling layer	281 × 64	60 × 3
6	Fully connected layer	60 × 3	60
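
The vertex counts in Tab. 1 shrink by roughly a factor of four at each of the first four downsampling layers. The short check below is only a reading aid: the factor of 4 and the round-up rule are inferred from the numbers in the table, not stated on this page, and the helper name `downsample_count` is hypothetical.

```python
# Vertex counts read from Tab. 1: the input mesh, then the output of the
# downsampling layer in blocks 1 to 4.
vertex_counts = [71926, 17982, 4496, 1124, 281]

def downsample_count(n, factor=4):
    """Ceiling division: number of vertices kept when reducing by `factor`.

    Assumption: the factor and the rounding direction are inferred from
    the table, not taken from the paper's text.
    """
    return -(-n // factor)

# Start from the input mesh and apply the inferred rule four times.
predicted = [vertex_counts[0]]
for _ in range(4):
    predicted.append(downsample_count(predicted[-1]))

print(predicted == vertex_counts)
```

The last downsampling layer (281 × 64 to 60 × 3) does not follow this pattern and is left out of the check.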

Tab. 2 Decoder layer input and output

Layer No.	Layer name	Input size	Output size
1	Fully connected layer	60	60 × 3
2	Spiral convolution layer [30]	60 × 3	281 × 64
2	Upsampling layer	281 × 64	281 × 64
3	Spiral convolution layer [30]	281 × 64	1 124 × 32
3	Upsampling layer	1 124 × 32	1 124 × 32
4	Spiral convolution layer [30]	1 124 × 32	4 496 × 32
4	Upsampling layer	4 496 × 32	4 496 × 32
5	Spiral convolution layer [30]	4 496 × 32	17 982 × 32
5	Upsampling layer	17 982 × 32	17 982 × 32
6	Spiral convolution layer [30]	17 982 × 32	71 926 × 32
6	Upsampling layer	71 926 × 32	71 926 × 3

Fig. 2 Architecture of vector discretization module

Fig. 3 Architecture of positional attention module

Tab. 3 Comparison of experimental results from different methods

Method	Diversity (↑)	JSD (↓)	MMD (↓)	COV/% (↑)	VP/% (↑)	Training time/min (↓)
β-VAE [38]	3.93	6.31	1.06	59.07	65.92	106
DIP-VAE-I [39]	3.12	11.15	0.93	52.82	50.60	108
LSGAN [40]	5.92	2.52	1.54	43.95	71.39	374
WGAN [41]	4.46	5.41	1.26	56.65	77.15	313
SD-VAE [4]	4.13	4.59	1.50	64.60	80.71	441
LED-VAE [5]	5.22	2.31	2.24	49.79	81.34	342
SHLED-VAE [6]	5.13	3.32	1.70	46.69	83.21	288
LFED	5.26	2.34	1.84	47.58	95.67	388
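
The percentage-point improvements quoted in the abstract follow directly from the VP column of Tab. 3. The sketch below recomputes them and, for contrast, also computes the relative improvement, which is a different quantity. The variable names are illustrative.

```python
# VP scores (%) copied from Tab. 3.
vp = {"SD-VAE": 80.71, "LED-VAE": 81.34, "SHLED-VAE": 83.21, "LFED": 95.67}

# Absolute gain in percentage points: a plain difference of two percentages.
gain_points = {
    name: round(vp["LFED"] - score, 2)
    for name, score in vp.items()
    if name != "LFED"
}

# Relative gain in percent: the difference divided by the baseline score.
gain_relative = {
    name: round(100 * (vp["LFED"] - score) / score, 2)
    for name, score in vp.items()
    if name != "LFED"
}

print(gain_points)  # 14.96, 14.33 and 12.46 percentage points
print(gain_relative)
```

The abstract reports the first quantity (percentage points), so its figures should not be read as relative improvements.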

Fig. 4 Effects of traversing each latent feature in different facial attributes

Fig. 5 Latent feature traversal effect diagrams of different methods

Tab. 4 Experimental results of each module in VP metric

Module	VP/% (↑)
None	81.34
PA	84.50
VQ	83.84
VQ+PA	90.27
EC	87.77
VQ+PA+EC	95.67
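
The ablation figures in Tab. 4 are easier to compare as gains over the baseline row ("None"). The sketch below is arithmetic on the table only, using the module abbreviations exactly as they appear there. It makes no claim about why the modules interact as they do.

```python
# VP scores (%) copied from Tab. 4.
vp_ablation = {
    "None": 81.34,
    "PA": 84.50,
    "VQ": 83.84,
    "VQ+PA": 90.27,
    "EC": 87.77,
    "VQ+PA+EC": 95.67,
}

baseline = vp_ablation["None"]

# Gain of each configuration over the baseline, in percentage points.
gain = {
    name: round(score - baseline, 2)
    for name, score in vp_ablation.items()
    if name != "None"
}

# Sum of the single-module gains, to compare with the combined configurations.
sum_vq_pa = round(gain["VQ"] + gain["PA"], 2)
sum_all_single = round(gain["VQ"] + gain["PA"] + gain["EC"], 2)

print(gain)
print(sum_vq_pa, gain["VQ+PA"])
print(sum_all_single, gain["VQ+PA+EC"])
```

On these numbers the combined configurations gain more than the sum of their single-module gains, so the modules appear to complement one another in this experiment.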

Fig. 6 Impact of each module on latent traversal

References 41

[1]	SILVA A G DA, MENDES GOMES M V, WINKLER I. Virtual reality and digital human modeling for ergonomic assessment in industrial product development: a patent and literature review [J]. Applied Sciences, 2022, 12(3): No.1084.
[2]	HE Q, LI L, LI D, et al. From digital human modeling to human digital twin: framework and perspectives in human factors [J]. Chinese Journal of Mechanical Engineering, 2024, 37(1): No.9.
[3]	YANG Y, ZHANG H, FERNÁNDEZ A B, et al. Digitalization of 3-D human bodies: a survey [J]. IEEE Transactions on Consumer Electronics, 2024, 70(1): 3152-3166.
[4]	FOTI S, KOO B, STOYANOV D, et al. 3D shape variational autoencoder latent disentanglement via mini-batch feature swapping for bodies and faces [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 18709-18718.
[5]	FOTI S, KOO B, STOYANOV D, et al. 3D generative model latent disentanglement via local eigenprojection [J]. Computer Graphics Forum, 2023, 42(6): No.e14793.
[6]	ZHAO S J, LIANG J Y, DU X F, et al. 3D face generation method based on local feature projection of spherical harmonic [J]. Journal of Computer-Aided Design and Computer Graphics, 2025, 37(3): 385-395.
[7]	HUANG Z. New face recognition technologies based on 3DMM [C]// Proceedings of the SPIE 12153, International Conference on Artificial Intelligence, Virtual Reality, and Visualization. Bellingham, WA: SPIE, 2021: No.1215315.
[8]	ZHAO Y, CAO X, LIU S, et al. A facial expression transfer method based on 3DMM and diffusion models [C]// Proceedings of the 2024 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2024: 3145-3149.
[9]	ZHANG H, REN Y, CHEN Y, et al. Exploiting multiple guidance from 3DMM for face reenactment [EB/OL]. [2024-04-28].
[10]	TRAN L, LIU X. Nonlinear 3D face morphable model [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7346-7355.
[11]	BOURITSAS G, BOKHNYAK S, PLOUMPIS S, et al. Neural 3D morphable models: spiral convolutional networks for 3D shape representation learning and generation [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 7212-7221.
[12]	BLANZ V, VETTER T. A morphable model for the synthesis of 3D faces [J]. Seminal Graphics Papers: Pushing the Boundaries, 2023, 2: No.18.
[13]	WANG X, LU H, LIU X, et al. Dynamic coordination of miscible polymer blends towards highly designable shape memory effect [J]. Polymer, 2020, 208: No.122946.
[14]	ALEXA M, COHEN-OR D, LEVIN D. As-rigid-as-possible shape interpolation [C]// Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 2000: 157-164.
[15]	ONIZUKA H, THOMAS D, UCHIYAMA H, et al. Landmark-guided deformation transfer of template facial expressions for automatic generation of avatar blend-shapes [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshop. Piscataway: IEEE, 2019: 2100-2108.
[16]	RACKOVIĆ S, SOARES C, JAKOVETIĆ D, et al. Clustering of the blendshape facial model [C]// Proceedings of the 29th European Signal Processing Conference. Piscataway: IEEE, 2021: 1556-1560.
[17]	MING X, LI J, LING J, et al. High-quality mesh blendshape generation from face videos via neural inverse rendering [C]// Proceedings of the 2024 European Conference on Computer Vision, LNCS 15128. Cham: Springer, 2025: 106-125.
[18]	GECER B, PLOUMPIS S, KOTSIA I, et al. GANFIT: generative adversarial network fitting for high fidelity 3D face reconstruction [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1155-1164.
[19]	MOSCHOGLOU S, PLOUMPIS S, NICOLAOU M A, et al. 3DFaceGAN: adversarial nets for 3D face representation, generation, and translation [J]. International Journal of Computer Vision, 2020, 128(10/11): 2534-2551.
[20]	LAN Y, MENG X, YANG S, et al. Self-supervised geometry-aware encoder for style-based 3D GAN inversion [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 20940-20949.
[21]	LIU Z, LI M, ZHANG Y, et al. Fine-grained face swapping via regional GAN inversion [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 8578-8587.
[22]	RAI A, GUPTA H, PANDEY A, et al. Towards realistic generative 3D face models [C]// Proceedings of the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2024: 3726-3736.
[23]	SONG X, FENG X, ZHU L, et al. SNP site-drug association prediction algorithm based on denoising variational auto-encoder [J]. Journal of Measurement Science and Instrumentation, 2022, 13(3): 300-308.
[24]	KIM S U, ROH J, IM H, et al. Anisotropic SpiralNet for 3D shape completion and denoising [J]. Sensors, 2022, 22(17): No.6457.
[25]	DEY R, BODDETI V N. Generating diverse 3D reconstructions from a single occluded face image [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 1537-1547.
[26]	ZHANG W, CUN X, WANG X, et al. SadTalker: learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 8652-8661.
[27]	TAN Q, ZHANG L X, YANG J, et al. Variational autoencoders for localized mesh deformation component analysis [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(10): 6297-6310.
[28]	YANG J, MO K, LAI Y K, et al. DSG-Net: learning disentangled structure and geometry for 3D shape generation [J]. ACM Transactions on Graphics, 2023, 42(1): No.1.
[29]	LI G, YANG H, HUANG D, et al. 3D face modeling via weakly-supervised disentanglement network joint identity-consistency prior [C]// Proceedings of the IEEE 18th International Conference on Automatic Face and Gesture Recognition. Piscataway: IEEE, 2024: 1-10.
[30]	GONG S, CHEN L, BRONSTEIN M, et al. SpiralNet++: a fast and highly efficient mesh convolution operator [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops. Piscataway: IEEE, 2019: 4141-4148.
[31]	REDEKOP E, PLEASURE M, WANG Z, et al. Codebook VQ-VAE approach for prostate cancer diagnosis using multiparametric MRI [C]// Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2024: 2365-2372.
[32]	NAIK P A, ESKANDARI Z. Nonlinear dynamics of a three-dimensional discrete time delay neural network [J]. International Journal of Biomathematics, 2024, 17(6): No.2350057.
[33]	FU J, LIU J, TIAN H, et al. Dual attention network for scene segmentation [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 3141-3149.
[34]	YU X, TANG L, RAO Y, et al. Point-BERT: pre-training 3D point cloud transformers with masked point modeling [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 19291-19300.
[35]	PLOUMPIS S, WANG H, PEARS N, et al. Combining 3D morphable models: a large scale face-and-head model [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 10926-10935.
[36]	ACHLIOPTAS P, DIAMANTI O, MITLIAGKAS I, et al. Learning representations and generative models for 3D point clouds [C]// Proceedings of the 35th International Conference on Machine Learning. New York: JMLR.org, 2018: 40-49.
[37]	ZHU X, XU C, TAO D. Learning disentangled representations with latent variation predictability [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12355. Cham: Springer, 2020: 684-700.
[38]	HIGGINS I, MATTHEY L, PAL A, et al. β-VAE: learning basic visual concepts with a constrained variational framework [EB/OL]. [2024-04-28].
[39]	KUMAR A, SATTIGERI P, BALAKRISHNAN A. Variational inference of disentangled latent concepts from unlabeled observations [EB/OL]. [2024-04-28].
[40]	MAO X, LI Q, XIE H, et al. Least squares generative adversarial networks [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2813-2821.
[41]	ARJOVSKY M, CHINTALA S, BOTTOU L. Wasserstein generative adversarial networks [C]// Proceedings of the 34th International Conference on Machine Learning. New York: JMLR.org, 2017: 214-223.

Related Articles 15

[1]	Zhihui ZAN, Yajing WANG, Ke LI, Zhixiang YANG, Guangyu YANG. Multi-feature fusion speech emotion recognition method based on SAA-CNN-BiLSTM network [J]. Journal of Computer Applications, 2026, 46(1): 69-76.
[2]	Ning CAO, Xin WEN, Yanrong HAO, Rui CAO. Lightweight motor imagery electroencephalogram decoding neural network with multi-domain feature fusion [J]. Journal of Computer Applications, 2026, 46(1): 289-296.
[3]	Yiming LIANG, Jing FAN, Wenze CHAI. Multi-scale feature fusion sentiment classification based on bidirectional cross attention [J]. Journal of Computer Applications, 2025, 45(9): 2773-2782.
[4]	Weigang LI, Jiale SHAO, Zhiqiang TIAN. Point cloud classification and segmentation network based on dual attention mechanism and multi-scale fusion [J]. Journal of Computer Applications, 2025, 45(9): 3003-3010.
[5]	Zhixiong XU, Bo LI, Xiaoyong BIAN, Qiren HU. Adversarial sample embedded attention U-Net for 3D medical image segmentation [J]. Journal of Computer Applications, 2025, 45(9): 3011-3016.
[6]	Fang WANG, Jing HU, Rui ZHANG, Wenting FAN. Medical image segmentation network with content-guided multi-angle feature fusion [J]. Journal of Computer Applications, 2025, 45(9): 3017-3025.
[7]	Yimeng XI, Zhen DENG, Qian LIU, Libo LIU. Cross-modal information fusion for video-text retrieval [J]. Journal of Computer Applications, 2025, 45(8): 2448-2456.
[8]	Chengzhi YAN, Ying CHEN, Kai ZHONG, Han GAO. 3D object detection algorithm based on multi-scale network and axial attention [J]. Journal of Computer Applications, 2025, 45(8): 2537-2545.
[9]	Jinhao LIN, Chuan LUO, Tianrui LI, Hongmei CHEN. Thoracic disease classification method based on cross-scale attention network [J]. Journal of Computer Applications, 2025, 45(8): 2712-2719.
[10]	Liang CHEN, Xuan WANG, Kun LEI. Helmet wearing detection algorithm for complex scenarios based on cross-layer multi-scale feature fusion [J]. Journal of Computer Applications, 2025, 45(7): 2333-2341.
[11]	Xiang WANG, Qianqian CUI, Xiaoming ZHANG, Jianchao WANG, Zhenzhou WANG, Jialin SONG. Wireless capsule endoscopy image classification model based on improved ConvNeXt [J]. Journal of Computer Applications, 2025, 45(6): 2016-2024.
[12]	Zonghang WU, Dong ZHANG, Guanyu LI. Multimodal fusion recommendation algorithm based on joint self-supervised learning [J]. Journal of Computer Applications, 2025, 45(6): 1858-1868.
[13]	Linjia SUN, Lei QIN, Meijin KANG, Yinglin WANG. Automatic speech segmentation algorithm based on syllable type recognition [J]. Journal of Computer Applications, 2025, 45(6): 2034-2042.
[14]	Ying HUANG, Shengmei GAO, Guang CHEN, Su LIU. Low-light image enhancement network combining signal-to-noise ratio guided dual-branch structure and histogram equalization [J]. Journal of Computer Applications, 2025, 45(6): 1971-1979.
[15]	Yali YANG, Ying LI, Yutao ZHANG, Peihua SONG. Review of multi-modal research methods for face recognition [J]. Journal of Computer Applications, 2025, 45(5): 1645-1657.