《计算机应用》(Journal of Computer Applications) 2025, Vol. 45, Issue (10): 3067-3073. DOI: 10.11772/j.issn.1001-9081.2024101535

• 人工智能 •    

基于动态上采样的轻量级生成对抗网络DU-FastGAN

徐国愚, 闫晓龙, 张一丹

  1. 河南财经政法大学 计算机与信息工程学院,郑州 450046
  • 收稿日期:2024-10-31 修回日期:2024-12-12 接受日期:2024-12-20 发布日期:2025-03-18 出版日期:2025-10-10
  • 通讯作者: 闫晓龙
  • 作者简介:徐国愚(1982—),男,安徽庐江人,副教授,博士,CCF高级会员,主要研究方向:深度学习
    闫晓龙(2000—),男,河南郑州人,硕士研究生,CCF会员,主要研究方向:深度学习
    张一丹(2001—),女,河南郑州人,硕士研究生,CCF会员,主要研究方向:深度学习。
  • 基金资助:
    国家自然科学基金资助项目(61602153)

DU-FastGAN: lightweight generative adversarial network based on dynamic upsampling

Guoyu XU, Xiaolong YAN, Yidan ZHANG

  1. School of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou, Henan 450046, China
  • Received:2024-10-31 Revised:2024-12-12 Accepted:2024-12-20 Online:2025-03-18 Published:2025-10-10
  • Contact: Xiaolong YAN
  • About author:XU Guoyu, born in 1982, Ph. D., associate professor. His research interests include deep learning.
    YAN Xiaolong, born in 2000, M. S. candidate. His research interests include deep learning.
    ZHANG Yidan, born in 2001, M. S. candidate. Her research interests include deep learning.
  • Supported by:
    National Natural Science Foundation of China (61602153)

摘要:

近年来,生成对抗网络(GAN)被广泛应用于数据增强,能有效缓解训练样本不足的问题,对模型训练具有重要研究意义。然而,现有用于数据增强的GAN模型存在对数据集要求高和模型收敛不稳定等问题,导致生成的图像易出现失真和形变。因此,提出一种基于动态上采样的轻量级GAN——DU-FastGAN(Dynamic-Upsample-FastGAN)进行数据增强。首先,通过动态上采样模块构建生成器,使生成器能够根据当前特征图的大小采用不同粒度的上采样方法,从而重建纹理,提高合成的整体结构和局部细节的质量;其次,为了使模型能够更好地获取图像的全局信息流,提出权重信息跳跃连接模块,以减小卷积及池化操作对特征的扰动,提高模型对不同特征的学习能力,使得模型生成图像的细节更逼真;最后,给出特征丢失损失函数,通过计算采样过程中对应特征图之间的相对距离提高模型生成质量。实验结果表明,相较于FastGAN、MixDL(Mixup-based Distance Learning)和RCL-master(Reverse Contrastive Learning-master)等方法,DU-FastGAN在10个小数据集上的FID(Fréchet Inception Distance)的最大降幅达到23.47%,能够有效缓解生成图像的失真和形变问题,并提高了生成图像的质量;同时,DU-FastGAN的模型训练时间在600 min内,实现了轻量级开销。

关键词: 生成对抗网络, 数据增强, 动态上采样, 权重信息跳跃连接, 特征丢失损失

Abstract:

In recent years, Generative Adversarial Networks (GANs) have been widely used for data augmentation; they can alleviate the problem of insufficient training samples effectively, which is of great significance for model training. However, the existing GAN models for data augmentation suffer from problems such as high requirements on datasets and unstable model convergence, so that the generated images are prone to distortion and deformation. Therefore, a lightweight GAN based on dynamic upsampling, namely DU-FastGAN (Dynamic-Upsample-FastGAN), was proposed for data augmentation. Firstly, the generator was constructed with a dynamic-upsample module, which allowed the generator to apply upsampling methods of different granularities according to the size of the current feature map, thereby reconstructing textures and improving the quality of both the overall structure and the local details of the synthesized images. Secondly, in order to let the model capture the global information flow of images better, a weight information skip connection module was proposed to reduce the disturbance of convolution and pooling operations on features, which improved the model's ability to learn different features and made the details of the generated images more realistic. Finally, a feature loss function was designed to improve generation quality by computing the relative distances between corresponding feature maps during the sampling process. Experimental results show that, compared with methods such as FastGAN, MixDL (Mixup-based Distance Learning), and RCL-master (Reverse Contrastive Learning-master), DU-FastGAN reduces FID (Fréchet Inception Distance) by up to 23.47% on 10 small datasets, alleviating the distortion and deformation of the generated images effectively and improving image quality. At the same time, DU-FastGAN keeps the overhead lightweight, with a model training time within 600 min.
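The article itself contains no code; purely as an illustrative sketch of two ideas mentioned in the abstract, the PyTorch-style snippet below shows (a) an upsampling step whose granularity is chosen from the current feature-map size and (b) a feature loss defined as a relative distance between corresponding feature maps collected during sampling. The function names dynamic_upsample and feature_loss, the size threshold of 32, the nearest/bilinear operator choice, and the normalised L1 distance are all assumptions made for illustration, not the authors' implementation.

import torch
import torch.nn.functional as F

def dynamic_upsample(x, size_threshold=32):
    # Pick the upsampling granularity from the current spatial size of the
    # feature map (the threshold and operator choices are assumptions).
    if x.shape[-1] < size_threshold:
        # coarse nearest-neighbour upsampling for small feature maps
        return F.interpolate(x, scale_factor=2, mode="nearest")
    # finer bilinear upsampling for larger feature maps, to preserve texture
    return F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)

def feature_loss(feats_a, feats_b, eps=1e-8):
    # Relative distance between corresponding feature maps collected during
    # the sampling (upsampling) process; normalised L1 is one plausible reading.
    total = 0.0
    for fa, fb in zip(feats_a, feats_b):
        total = total + (fa - fb).abs().mean() / (fa.abs().mean() + eps)
    return total / len(feats_a)

# Toy usage of the two sketched components.
x = torch.randn(1, 64, 16, 16)
y = dynamic_upsample(x)      # (1, 64, 32, 32), nearest, since 16 < 32
z = dynamic_upsample(y)      # (1, 64, 64, 64), bilinear, since 32 >= 32
print(float(feature_loss([y, z], [0.9 * y, z + 0.1])))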

Key words: Generative Adversarial Network (GAN), data augmentation, dynamic upsampling, weight information skip connection, feature loss function
