DU-FastGAN: lightweight generative adversarial network based on dynamic-upsample

doi:10.11772/j.issn.1001-9081.2024101535

Abstract

In recent years, Generative Adversarial Networks (GANs) have been widely used for data augmentation, where they effectively alleviate the shortage of training samples and are therefore of considerable research value for model training. However, existing GAN models for data augmentation place high demands on the dataset and converge unstably, which leads to distorted and deformed generated images. To address this, a lightweight GAN based on dynamic upsampling, DU-FastGAN (Dynamic-Upsample-FastGAN), is proposed for data augmentation. Firstly, the generator is built from a dynamic-upsample module, which lets it apply upsampling methods of different granularities according to the size of the current feature map, thereby reconstructing textures and improving both the overall structure and the local detail of the synthesized images. Secondly, so that the model can better capture the global information flow of images, a weight information skip connection module is proposed to reduce the disturbance that convolution and pooling operations introduce into the features, which improves the model's ability to learn different features and makes the details of the generated images more realistic. Finally, a feature loss function is introduced, which improves generation quality by computing the relative distance between corresponding feature maps during the sampling process. Experimental results on 10 small datasets show that, compared with methods such as FastGAN, MixDL (Mixup-based Distance Learning), and RCL-master (Reverse Contrastive Learning-master), DU-FastGAN reduces the FID (Fréchet Inception Distance) by up to 23.47%, effectively alleviating distortion and deformation in the generated images and improving their quality. In addition, DU-FastGAN remains lightweight, with a model training time within 600 min.
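The abstract does not say which upsampling methods the dynamic-upsample module switches between, or at what feature-map sizes. As a rough illustration of the idea only, the sketch below picks between two standard 2× upsamplers according to the current map size. The choice of nearest-neighbour versus bilinear interpolation, the `size_threshold` parameter, and all function names are hypothetical and are not taken from the paper.

```python
from typing import List

Grid = List[List[float]]  # a single-channel feature map as nested lists


def upsample_nearest(x: Grid) -> Grid:
    """Coarse 2x upsampling: every value is repeated as a 2x2 block."""
    out: Grid = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def upsample_bilinear(x: Grid) -> Grid:
    """Finer 2x upsampling: bilinear interpolation with aligned corners."""
    h, w = len(x), len(x[0])
    out_h, out_w = 2 * h, 2 * w
    out: Grid = []
    for i in range(out_h):
        # Map the output row back to a fractional source row.
        sy = i * (h - 1) / (out_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for j in range(out_w):
            sx = j * (w - 1) / (out_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = x[y0][x0] * (1 - fx) + x[y0][x1] * fx
            bottom = x[y1][x0] * (1 - fx) + x[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def dynamic_upsample(x: Grid, size_threshold: int = 4) -> Grid:
    """Choose the upsampler from the current map size (hypothetical rule)."""
    if len(x) < size_threshold:
        return upsample_nearest(x)   # small map: cheap, coarse method
    return upsample_bilinear(x)      # larger map: smoother, finer method


small = [[1.0, 2.0], [3.0, 4.0]]
print(dynamic_upsample(small))
```

In a real generator the same dispatch would sit in front of each resolution stage; the point of the sketch is only that the method is selected from the feature-map size rather than fixed for the whole network.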

Key words: Generative Adversarial Network (GAN), data augmentation, dynamic-upsample, weight information skip connection, feature loss function

CLC Number: TP391

Guoyu XU, Xiaolong YAN, Yidan ZHANG. DU-FastGAN: lightweight generative adversarial network based on dynamic-upsample[J]. Journal of Computer Applications, 2025, 45(10): 3067-3073.

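Resolution	Sub-dataset	Number of samples
256×256	Dog	389
256×256	Cat	160
256×256	Human Face	100
256×256	Panda	101
1024×1024	Pokemon	833
1024×1024	Art-Painting	124
1024×1024	Fauvism	134
1024×1024	Moongate	137
1024×1024	Shells	97
1024×1024	Skulls	64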

Model	Epochs (×10⁴)	Dog FID	Dog LPIPS	Cat FID	Cat LPIPS	Human Face FID	Human Face LPIPS	Panda FID	Panda LPIPS
FastGAN	6	101.81	0.6355	129.74	0.6054	50.55	0.5327	11.32	0.5222
MixDL	12	155.61	0.6033	147.93	0.5870	61.57	0.5037	13.99	0.4796
RCL-master	20	99.26	0.6371	131.94	0.6103	51.48	0.5613	11.07	0.5283
DU-FastGAN	6	92.29	0.6471	112.17	0.6178	48.71	0.5945	10.40	0.5405

Model	Epochs (×10⁴)	Pokemon FID	Pokemon LPIPS	Art-Painting FID	Art-Painting LPIPS	Fauvism FID	Fauvism LPIPS	Moongate FID	Moongate LPIPS	Shells FID	Shells LPIPS	Skulls FID	Skulls LPIPS
FastGAN	6	75.77	0.567	51.58	0.746	194.11	0.775	139.94	0.709	296.84	0.433	202.59	0.620
MixDL	12	98.42	0.514	63.33	0.720	204.01	0.721	177.29	0.675	304.91	0.401	253.20	0.607
RCL-master	20	70.13	0.633	49.67	0.732	189.16	0.741	140.84	0.697	229.27	0.458	190.57	0.640
DU-FastGAN	6	75.69	0.607	51.36	0.744	185.33	0.800	134.87	0.712	227.17	0.464	160.69	0.667
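
The abstract reports a maximum FID reduction of 23.47% without naming the baseline or the dataset. A short check against the FID columns of the tables above, assuming the reduction is measured relative to FastGAN as (baseline − ours) / baseline, suggests where that figure comes from. The dictionary layout and function names below are illustrative only.

```python
# FID values copied from the results tables above (lower is better).
FID = {
    "FastGAN": {
        "Dog": 101.81, "Cat": 129.74, "Human Face": 50.55, "Panda": 11.32,
        "Pokemon": 75.77, "Art-Painting": 51.58, "Fauvism": 194.11,
        "Moongate": 139.94, "Shells": 296.84, "Skulls": 202.59,
    },
    "DU-FastGAN": {
        "Dog": 92.29, "Cat": 112.17, "Human Face": 48.71, "Panda": 10.40,
        "Pokemon": 75.69, "Art-Painting": 51.36, "Fauvism": 185.33,
        "Moongate": 134.87, "Shells": 227.17, "Skulls": 160.69,
    },
}


def relative_reduction(baseline: float, ours: float) -> float:
    """Percentage by which `ours` is lower than `baseline`."""
    return (baseline - ours) / baseline * 100.0


def fid_reductions(baseline_name: str = "FastGAN",
                   ours_name: str = "DU-FastGAN") -> dict:
    """Per-dataset FID reduction of one model relative to another."""
    base, ours = FID[baseline_name], FID[ours_name]
    return {name: relative_reduction(base[name], ours[name]) for name in base}


reductions = fid_reductions()
best = max(reductions, key=reductions.get)
print(f"{best}: {reductions[best]:.2f}%")  # prints "Shells: 23.47%"
```

Under this assumption the largest reduction falls on the Shells dataset and matches the 23.47% quoted in the abstract; measuring against a different baseline model would give different percentages.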