Controllable face editing algorithm with closed-form solution

doi:10.11772/j.issn.1001-9081.2022010030

Journal of Computer Applications, 2023, Vol. 43, Issue 2: 601-607.

Section: Multimedia computing and computer simulation

Lingling TAO¹,², Bo LIU¹,², Wenbo LI¹,², Xiping HE¹,²

1. School of Artificial Intelligence, Chongqing Technology and Business University, Chongqing 400067, China
2. Chongqing Key Laboratory of Intelligent Perception and BlockChain Technology (Chongqing Technology and Business University), Chongqing 400067, China

Received: 2022-01-11; Revised: 2022-04-05; Accepted: 2022-04-11; Online: 2022-04-28; Published: 2023-02-10
Contact: Bo LIU
About the authors:
TAO Lingling, born in 1998, a native of Chongqing, M.S. candidate. Her research interests include computer vision, image processing, and generative adversarial networks.
LI Wenbo, born in 1998, a native of Chongqing, M.S. candidate. His research interests include machine learning, computer vision, and image generation.
HE Xiping, born in 1968, a native of Chongqing, Ph.D., professor. His research interests include machine learning, data analysis and processing, and computer vision.
Supported by:
Key Platform Fund of Chongqing Technology and Business University (950119093); Graduate Innovation Project of Chongqing Technology and Business University (yjscxx2021-112-99)


Abstract

To address problems in face editing such as unnatural editing results and excessive changes in the generated images, a controllable face editing algorithm with a closed-form solution was proposed. Firstly, n latent vectors were sampled randomly to construct a sample matrix, and the top k principal component vectors of this matrix were calculated. Then, five attributes of the face images were obtained by ResNet-50, and the semantic boundary of each attribute was calculated by a Support Vector Machine (SVM). Finally, the interpretable direction vectors of these attributes were calculated so that they were as close as possible to the principal component vectors while staying as far as possible from the semantic boundary of the corresponding attribute, thereby reducing the coupling between facial attributes and improving controllability in face editing. Because the algorithm has a closed-form solution, it is highly efficient. Experimental results show that, compared with the Closed-Form Factorization of Latent Semantics in GANs (SeFa) algorithm and the Discovering Interpretable Generative Adversarial Network Controls (GANSpace) algorithm, the proposed algorithm increases the Inception Score (IS) by 19% and 26% respectively, decreases the Fréchet Inception Distance (FID) by 4% and 37% respectively, and decreases the Maximum Mean Discrepancy (MMD) by 15% and 48% respectively. These results indicate that the algorithm offers good controllability and disentanglement.
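The three steps in the abstract can be illustrated with a small NumPy sketch. The abstract does not state the objective function, so the formulation below is an assumption made purely for illustration and should not be read as the authors' closed-form solution: it picks the unit vector d that maximizes the squared projections onto the top k principal components plus a weighted squared projection onto the normal of the attribute boundary, which reduces to a single eigendecomposition. The function names, the `weight` parameter, and the random stand-in for the SVM boundary normal are all hypothetical; in practice the normal would come from a linear SVM fitted on attribute labels predicted by a classifier such as ResNet-50.

```python
import numpy as np


def top_k_principal_components(latents, k):
    """Return the top-k principal directions (as rows) of an (n, d) sample matrix."""
    centered = latents - latents.mean(axis=0, keepdims=True)
    # Rows of vt are right singular vectors, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]


def attribute_direction(components, boundary_normal, weight=1.0):
    """Illustrative closed-form direction (an assumption, not the paper's formula).

    Maximizes sum_i (d . v_i)^2 + weight * (d . n)^2 over unit vectors d,
    where v_i are the principal components and n is the unit normal of the
    attribute's semantic boundary. The maximizer is the leading eigenvector
    of V^T V + weight * n n^T.
    """
    n = boundary_normal / np.linalg.norm(boundary_normal)
    matrix = components.T @ components + weight * np.outer(n, n)
    # eigh returns eigenvalues in ascending order, so the last column is the leader.
    _, eigvecs = np.linalg.eigh(matrix)
    direction = eigvecs[:, -1]
    # Orient the direction toward the positive side of the boundary.
    if direction @ n < 0:
        direction = -direction
    return direction


rng = np.random.default_rng(0)
latents = rng.standard_normal((1000, 64))      # n = 1000 sampled latent vectors
components = top_k_principal_components(latents, k=5)
normal = rng.standard_normal(64)               # hypothetical stand-in for an SVM boundary normal
direction = attribute_direction(components, normal, weight=2.0)
edited_latent = latents[0] + 1.0 * direction   # move one latent code along the direction
```

Larger values of the hypothetical `weight` pull the direction toward the boundary normal, while smaller values keep it closer to the principal subspace; the edited latent code would then be passed to the generator.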

Key words: Generative Adversarial Network (GAN), face editing, latent space, semantic space, attribute semantic boundary


CLC Number: TP311.5

Lingling TAO, Bo LIU, Wenbo LI, Xiping HE. Controllable face editing algorithm with closed-form solution[J]. Journal of Computer Applications, 2023, 43(2): 601-607.

Attribute	ProGAN	StyleGAN	StyleGAN2
age	1.00	0.40	0.60
hairstyle	0.20	0.60	0.40
gender	1.00	1.40	0.20
pose	1.20	0.80	0.80
smile	0.80	0.40	1.20

Generative model	Attribute	SeFa IS	SeFa FID	SeFa MMD	GANSpace IS	GANSpace FID	GANSpace MMD	Proposed IS	Proposed FID	Proposed MMD
ProGAN	age	1.96	0.69	0.24	2.00	1.40	0.59	2.04	0.66	0.24
	hairstyle	2.05	0.55	0.22	1.85	0.47	0.47	2.29	0.49	0.17
	gender	2.02	0.73	0.24	2.00	1.40	0.59	2.27	0.69	0.19
	pose	2.13	0.64	0.33	2.43	0.43	0.39	2.35	0.36	0.22
	smile	2.07	0.62	0.25	1.85	0.91	0.24	2.21	0.65	0.25
	Average	2.05	0.65	0.26	2.03	0.92	0.46	2.23	0.57	0.21
StyleGAN	age	1.92	0.60	0.27	1.63	2.45	0.86	2.17	0.74	0.23
	hairstyle	2.12	1.60	0.24	2.04	1.50	0.27	2.60	1.37	0.21
	gender	2.03	0.62	0.27	1.55	2.22	0.80	2.50	0.76	0.35
	pose	2.03	0.76	0.24	1.74	1.65	0.27	2.81	0.66	0.23
	smile	2.05	0.62	0.25	2.11	0.85	0.16	2.23	0.74	0.20
	Average	2.03	0.84	0.25	1.81	1.73	0.47	2.46	0.85	0.24
StyleGAN2	age	2.50	0.77	0.30	1.97	0.67	0.40	2.51	0.67	0.26
	hairstyle	2.19	0.73	0.29	2.14	0.69	0.40	2.86	0.71	0.19
	gender	2.12	0.66	0.30	2.37	0.71	0.36	3.05	0.76	0.21
	pose	1.89	0.65	0.29	1.99	0.70	0.37	3.04	0.67	0.24
	smile	2.26	0.70	0.30	1.99	0.69	0.38	2.45	0.60	0.21
	Average	2.19	0.70	0.30	2.09	0.69	0.38	2.78	0.68	0.22
Overall average		2.09	0.73	0.27	1.98	1.12	0.44	2.49	0.70	0.23
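
The percentage changes quoted in the abstract can be checked against the overall-average row of the table above. The short sketch below recomputes the relative change of the proposed algorithm against each baseline; the dictionary layout and the helper name `relative_change` are illustrative choices, while the numbers are copied from the last table row.

```python
# Overall averages copied from the last row of the comparison table.
overall_average = {
    "SeFa": {"IS": 2.09, "FID": 0.73, "MMD": 0.27},
    "GANSpace": {"IS": 1.98, "FID": 1.12, "MMD": 0.44},
    "Proposed": {"IS": 2.49, "FID": 0.70, "MMD": 0.23},
}


def relative_change(new, old):
    """Signed relative change of `new` with respect to `old`, in percent."""
    return (new - old) / old * 100.0


changes = {
    baseline: {
        metric: relative_change(
            overall_average["Proposed"][metric], overall_average[baseline][metric]
        )
        for metric in ("IS", "FID", "MMD")
    }
    for baseline in ("SeFa", "GANSpace")
}

for baseline, per_metric in changes.items():
    for metric, change in per_metric.items():
        print(f"{metric} vs {baseline}: {change:+.1f}%")
```

A positive change is an improvement for IS, and a negative change is an improvement for FID and MMD. The computed values agree with the abstract to within rounding; the FID change against GANSpace works out to about 37.5%, which the abstract reports as 37%.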

