Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (4): 1300-1308. DOI: 10.11772/j.issn.1001-9081.2025040398
• Multimedia Computing and Computer Simulation •
Received: 2025-04-14
Revised: 2025-06-30
Accepted: 2025-07-02
Online: 2025-07-07
Published: 2026-04-10
Contact: Bo LIU
About author: CUI Xuan, born in 1999 in Loudi, Hunan, M. S. candidate. His research interests include computer vision and image processing.
Supported by:
Abstract:
Unsupervised face attribute editing methods based on the latent space of Generative Adversarial Networks (GANs) are efficient and require no labeled data. However, such methods still face challenges in disentanglement and controllability: manipulating one facial attribute may cause unintended changes in other attributes and degrade the editing result, and it is difficult to precisely control the degree of change of the edited attribute. To address these problems, an unsupervised face attribute editing method based on a dynamic convolutional autoencoder (AUFAE) was proposed. The method achieves precise editing of facial attributes by learning effective semantic vectors in the latent space. Specifically, a Dynamic Convolutional AutoEncoder Network (DCAE-Net) was designed as the backbone. In the encoder, Dynamic Convolution (DyConv) was adopted to dynamically extract local features of the latent space, so that semantic vectors with local characteristics were learned; in the decoder, a Channel Attention (CA) mechanism was incorporated to establish nonlinear dependencies among channels, enabling the model to focus autonomously on feature channels related to different semantics and effectively promoting the independence of the learned semantic vectors. To enhance the disentanglement and controllability of the semantic vectors, a loss function based on attribute boundary vectors was introduced to train DCAE-Net. In addition, a soft orthogonality loss was introduced to keep the semantic vectors mutually independent, further improving disentanglement. Experimental results show that, on three pretrained GAN generators and compared with three mainstream face attribute editing methods, AUFAE reduces the Fréchet Inception Distance (FID) by 37.43%-50.21%, reduces the Learned Perceptual Image Patch Similarity (LPIPS) by 23.61%-42.85%, and increases the Structural SIMilarity index (SSIM) by 7.04%-13.42%. Visually, AUFAE also exhibits no attribute entanglement during face attribute editing. Therefore, AUFAE can effectively alleviate attribute entanglement in face editing and achieve more precise face attribute editing.
CLC number:
Xuan CUI, Bo LIU. Unsupervised face attribute editing method based on dynamic convolutional autoencoder[J]. Journal of Computer Applications, 2026, 46(4): 1300-1308.
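The abstract describes DCAE-Net only at a high level: a dynamic-convolution (DyConv) encoder over GAN latent codes and a channel-attention (CA) decoder. The sketch below is not the authors' implementation; it is a minimal PyTorch illustration of those two building blocks under assumed settings (a 512-dimensional latent code treated as a 1-D signal, K = 4 candidate kernels, SE-style channel attention; module names such as `DynamicConv1d` and `DCAESketch` are made up for illustration).

```python
# Illustrative sketch only: the paper's exact DCAE-Net configuration is not given on
# this page; layer sizes, kernel count K and module names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv1d(nn.Module):
    """DyConv-style 1-D convolution: an attention-weighted mixture of K kernels."""
    def __init__(self, in_ch, out_ch, kernel_size=3, K=4):
        super().__init__()
        self.K, self.in_ch, self.out_ch, self.ks = K, in_ch, out_ch, kernel_size
        self.weight = nn.Parameter(torch.randn(K, out_ch, in_ch, kernel_size) * 0.02)
        self.attn = nn.Sequential(nn.AdaptiveAvgPool1d(1), nn.Flatten(),
                                  nn.Linear(in_ch, K))

    def forward(self, x):                        # x: (B, in_ch, L)
        pi = F.softmax(self.attn(x), dim=1)      # (B, K) kernel attention
        w = torch.einsum('bk,koif->boif', pi, self.weight)  # per-sample kernels
        b, _, L = x.shape
        x = x.reshape(1, b * self.in_ch, L)      # grouped-conv trick for per-sample weights
        w = w.reshape(b * self.out_ch, self.in_ch, self.ks)
        y = F.conv1d(x, w, padding=self.ks // 2, groups=b)
        return y.reshape(b, self.out_ch, L)

class ChannelAttention(nn.Module):
    """SE-style channel attention standing in for the decoder's CA block."""
    def __init__(self, ch, r=4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(inplace=True),
                                nn.Linear(ch // r, ch), nn.Sigmoid())

    def forward(self, x):                        # x: (B, ch, L)
        s = self.fc(x.mean(dim=2))               # squeeze over length, excite channels
        return x * s.unsqueeze(-1)

class DCAESketch(nn.Module):
    """Toy autoencoder over GAN latent codes, treated as length-512 1-D signals."""
    def __init__(self, latent_dim=512, ch=16):
        super().__init__()
        self.encoder = nn.Sequential(DynamicConv1d(1, ch), nn.ReLU(inplace=True),
                                     DynamicConv1d(ch, ch), nn.ReLU(inplace=True))
        self.decoder = nn.Sequential(ChannelAttention(ch),
                                     nn.Conv1d(ch, 1, kernel_size=3, padding=1))

    def forward(self, z):                        # z: (B, latent_dim)
        h = self.encoder(z.unsqueeze(1))
        return self.decoder(h).squeeze(1)        # reconstructed latent code

if __name__ == "__main__":
    z = torch.randn(8, 512)                      # e.g. StyleGAN W-space codes
    print(DCAESketch()(z).shape)                 # torch.Size([8, 512])
```

The per-sample kernels are realized with the usual grouped-convolution trick, so each latent code in the batch is filtered by its own attention-weighted kernel mixture.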
Tab. 1 Experimental environment configuration

| Environment | Item | Specification |
|---|---|---|
| Hardware | GPU | NVIDIA A100 80 GB PCIe GPU |
| Hardware | CPU | Intel Xeon Platinum 8358 |
| Software | Operating system | Ubuntu 18.04.1 |
| Software | Development environment | PyTorch 1.10 |
Tab. 2 Comparison of facial attribute editing effects by different methods

| Generative model | Attribute | InterFaceGAN FID | InterFaceGAN LPIPS | InterFaceGAN SSIM | AdaTrans FID | AdaTrans LPIPS | AdaTrans SSIM | SDFlow FID | SDFlow LPIPS | SDFlow SSIM | AUFAE FID | AUFAE LPIPS | AUFAE SSIM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ProGAN | Age | 79.10 | 0.32 | 0.69 | 70.65 | 0.29 | 0.72 | 62.95 | 0.26 | 0.74 | 34.54 | 0.18 | 0.81 |
| | Hair | 50.31 | 0.25 | 0.77 | 76.77 | 0.30 | 0.71 | 60.76 | 0.24 | 0.77 | 38.06 | 0.20 | 0.79 |
| | Gender | 89.52 | 0.36 | 0.65 | 76.47 | 0.30 | 0.71 | 74.35 | 0.31 | 0.69 | 47.44 | 0.25 | 0.73 |
| | Pose | 65.06 | 0.39 | 0.61 | — | — | — | — | — | — | 45.90 | 0.28 | 0.70 |
| | Smile | 45.82 | 0.21 | 0.80 | 75.86 | 0.32 | 0.69 | 60.66 | 0.23 | 0.78 | 33.76 | 0.18 | 0.81 |
| | Average | 65.96 | 0.31 | 0.70 | 74.94 | 0.30 | 0.71 | 64.68 | 0.26 | 0.75 | 39.94 | 0.22 | 0.77 |
| StyleGAN | Age | 68.93 | 0.33 | 0.68 | 62.34 | 0.26 | 0.74 | 59.39 | 0.24 | 0.72 | 41.57 | 0.21 | 0.80 |
| | Hair | 68.63 | 0.32 | 0.71 | 72.55 | 0.35 | 0.66 | 66.50 | 0.25 | 0.71 | 35.78 | 0.19 | 0.81 |
| | Gender | 82.38 | 0.38 | 0.63 | 78.59 | 0.33 | 0.68 | 68.08 | 0.26 | 0.69 | 46.36 | 0.24 | 0.76 |
| | Pose | 75.65 | 0.41 | 0.61 | — | — | — | — | — | — | 43.25 | 0.25 | 0.76 |
| | Smile | 30.65 | 0.17 | 0.84 | 68.21 | 0.33 | 0.68 | 57.42 | 0.22 | 0.73 | 39.59 | 0.20 | 0.81 |
| | Average | 65.25 | 0.32 | 0.69 | 70.42 | 0.32 | 0.69 | 62.85 | 0.24 | 0.71 | 41.31 | 0.22 | 0.79 |
| StyleGAN2 | Age | 69.39 | 0.37 | 0.76 | 74.03 | 0.26 | 0.79 | 52.41 | 0.18 | 0.81 | 28.98 | 0.15 | 0.85 |
| | Hair | 61.16 | 0.34 | 0.78 | 81.85 | 0.25 | 0.78 | 45.57 | 0.28 | 0.78 | 37.70 | 0.20 | 0.80 |
| | Gender | 62.77 | 0.33 | 0.79 | 79.92 | 0.26 | 0.79 | 59.00 | 0.22 | 0.77 | 27.23 | 0.14 | 0.86 |
| | Pose | 74.95 | 0.38 | 0.76 | — | — | — | — | — | — | 30.18 | 0.16 | 0.84 |
| | Smile | 69.95 | 0.36 | 0.77 | 44.18 | 0.25 | 0.80 | 50.21 | 0.26 | 0.78 | 30.29 | 0.16 | 0.84 |
| | Average | 67.64 | 0.37 | 0.77 | 70.00 | 0.26 | 0.79 | 51.80 | 0.24 | 0.79 | 30.88 | 0.16 | 0.84 |
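FID [34], LPIPS [35] and SSIM [36] reported in Tab. 2 are standard image-quality metrics. As one possible tooling choice (not specified by the paper), they can be computed with the torchmetrics package; the snippet below assumes torchmetrics with its image extras installed and input tensors scaled to [0, 1].

```python
# Hedged example: one common way to compute FID / LPIPS / SSIM with torchmetrics;
# the paper does not state its implementation, so treat this only as an illustration.
import torch
from torchmetrics.image import (FrechetInceptionDistance,
                                LearnedPerceptualImagePatchSimilarity,
                                StructuralSimilarityIndexMeasure)

real = torch.rand(4, 3, 256, 256)    # placeholder batches: original faces
edited = torch.rand(4, 3, 256, 256)  # edited faces, values assumed in [0, 1]

# FID: distribution distance between real and edited images (lower is better).
fid = FrechetInceptionDistance(feature=2048, normalize=True)
fid.update(real, real=True)
fid.update(edited, real=False)
print("FID:", fid.compute().item())

# LPIPS: learned perceptual distance between image pairs (lower is better).
lpips = LearnedPerceptualImagePatchSimilarity(net_type="alex", normalize=True)
print("LPIPS:", lpips(edited, real).item())

# SSIM: structural similarity between image pairs (higher is better).
ssim = StructuralSimilarityIndexMeasure(data_range=1.0)
print("SSIM:", ssim(edited, real).item())
```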
Tab. 3 Quantitative results of ablation experiments

| Method No. | Components | FID | LPIPS | SSIM |
|---|---|---|---|---|
| 1 | DCAE-Net | 47.16 | 0.24 | 0.78 |
| 2 | MLP | 45.01 | 0.24 | 0.77 |
| 3 | DCAE-Net | 53.24 | 0.28 | 0.71 |
| 4 | DCAE-Net | 42.34 | 0.23 | 0.77 |
| 5 | DCAE-Net | 41.31 | 0.22 | 0.79 |
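The soft orthogonality loss evaluated in the ablation study is, in its common formulation, a penalty on how far the Gram matrix of the unit-normalized semantic direction vectors deviates from the identity. The paper's exact formulation and weighting are not reproduced on this page, so the following is only a plausible sketch under that common formulation.

```python
# Plausible sketch of a soft orthogonality loss over n semantic direction vectors;
# the paper's exact formulation/weighting is not specified on this page.
import torch
import torch.nn.functional as F

def soft_orthogonality_loss(directions: torch.Tensor) -> torch.Tensor:
    """directions: (n, d) matrix whose rows are learned semantic vectors."""
    d = F.normalize(directions, dim=1)            # unit-normalize each direction
    gram = d @ d.t()                              # (n, n) cosine-similarity matrix
    identity = torch.eye(d.size(0), device=d.device)
    return ((gram - identity) ** 2).sum()         # squared Frobenius-norm penalty

directions = torch.randn(5, 512, requires_grad=True)  # e.g. 5 attribute directions in W-space
loss = soft_orthogonality_loss(directions)
loss.backward()                                   # gradients push directions toward mutual orthogonality
print(loss.item())
```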
| [1] | JIN C, ZHOU M L, LIN M X, et al. VR/AR-AdaptFace: an adaptive multimodal face replacement model for virtual and augmented reality [J]. Journal of Communication University of China (Science and Technology), 2024, 31(4): 55-63. |
| [2] | XU Y, YIN Y, JIANG L, et al. TransEditor: Transformer-based dual-space GAN for highly controllable facial editing [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 7673-7682. |
| [3] | HE X, ZHU M, CHEN D, et al. Diff-Privacy: diffusion-based face privacy protection [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34(12): 13164-13176. |
| [4] | ZHU J Y, PARK T, ISOLA P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2242-2251. |
| [5] | TORBUNOV D, HUANG Y, YU H, et al. UVCGAN: UNet vision Transformer cycle-consistent GAN for unpaired image-to-image translation [C]// Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2023: 702-712. |
| [6] | HUANG X, LIU M Y, BELONGIE S, et al. Multimodal unsupervised image-to-image translation [C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11207. Cham: Springer, 2018: 179-196. |
| [7] | MIRZA M, OSINDERO S. Conditional generative adversarial nets [EB/OL]. [2025-04-10]. |
| [8] | LIU M, DING Y, XIA M, et al. STGAN: a unified selective transfer network for arbitrary image attribute editing [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 3668-3677. |
| [9] | PERNUŠ M, ŠTRUC V, DOBRIŠEK S. MaskFaceGAN: high-resolution face editing with masked GAN latent code optimization [J]. IEEE Transactions on Image Processing, 2023, 32: 5893-5908. |
| [10] | TAO L L, LIU B, LI W B, et al. Controllable face editing algorithm with closed-form solutions [J]. Journal of Computer Applications, 2023, 43(2): 601-607. |
| [11] | CHOI Y, CHOI M, KIM M, et al. StarGAN: unified generative adversarial networks for multi-domain image-to-image translation[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8789-8797. |
| [12] | CHOI Y, UH Y, YOO J, et al. StarGAN v2: diverse image synthesis for multiple domains [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 8185-8194. |
| [13] | HE Z, ZUO W, KAN M, et al. AttGAN: facial attribute editing by only changing what you want [J]. IEEE Transactions on Image Processing, 2019, 28(11): 5464-5478. |
| [14] | NAVEH C. Multi-directional subspace editing in style-space [C]// Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2023: 7104-7114. |
| [15] | ZHUANG P, KOYEJO O, SCHWING A G. Enjoy your editing: controllable GANs for image editing via latent space navigation [EB/OL]. [2025-04-14]. |
| [16] | SHEN Y, YANG C, TANG X, et al. InterFaceGAN: interpreting the disentangled face representation learned by GANs [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(4): 2004-2018. |
| [17] | HÄRKÖNEN E, HERTZMANN A, LEHTINEN J, et al. GANSpace: discovering interpretable GAN controls [C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 9841-9850. |
| [18] | CHOI J, LEE J, YOON C, et al. Do not escape from the manifold: discovering the local coordinates on the latent space of GANs [EB/OL]. [2025-04-14]. |
| [19] | SHEN Y, ZHOU B. Closed-form factorization of latent semantics in GANs [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 1532-1540. |
| [20] | HUANG Z, MA S, ZHANG J, et al. Adaptive nonlinear latent transformation for conditional face editing [C]// Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2023: 20965-20974. |
| [21] | LI B, HUANG Z, SHAN H, et al. Semantic latent decomposition with normalizing flows for face editing [C]// Proceedings of the 2024 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2024: 4165-4169. |
| [22] | ABDAL R, ZHU P, MITRA N J, et al. StyleFlow: attribute-conditioned exploration of StyleGAN-generated images using conditional continuous normalizing flows [J]. ACM Transactions on Graphics, 2021, 40(3): No.21. |
| [23] | BAYKAL A C, ANEES A B, CEYLAN D, et al. CLIP-guided StyleGAN inversion for text-driven real image editing [J]. ACM Transactions on Graphics, 2023, 42(5): No.172. |
| [24] | PATASHNIK O, WU Z, SHECHTMAN E, et al. StyleCLIP: text-driven manipulation of StyleGAN imagery [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 2065-2074. |
| [25] | KINGMA D P, WELLING M. Auto-encoding variational Bayes [EB/OL]. [2025-03-11]. |
| [26] | RADFORD A, METZ L, CHINTALA S. Unsupervised representation learning with deep convolutional generative adversarial networks [EB/OL]. [2025-02-15]. |
| [27] | JOGIN M, MOHANA, MADHULIKA M S, et al. Feature extraction using Convolution Neural Networks (CNN) and deep learning [C]// Proceedings of the 3rd IEEE International Conference on Recent Trends in Electronics, Information and Communication Technology. Piscataway: IEEE, 2018: 2319-2323. |
| [28] | MENG C, YANG J, LIN W, et al. CTA-Net: a CNN-Transformer aggregation network for improving multi-scale feature extraction [EB/OL]. [2025-04-09]. |
| [29] | CUI T, LI J, LIU L. TAOTF: a two-stage approximately orthogonal training framework in deep neural networks [C]// Proceedings of the 26th European Conference on Artificial Intelligence/ 12th Conference on Prestigious Applications of Intelligent Systems. Amsterdam: IOS Press, 2023: 509-516. |
| [30] | KARRAS T, AILA T, LAINE S, et al. Progressive growing of GANs for improved quality, stability, and variation [EB/OL]. [2025-04-14]. |
| [31] | KARRAS T, LAINE S, AILA T. A style-based generator architecture for generative adversarial networks [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 4396-4405. |
| [32] | KARRAS T, LAINE S, AITTALA M, et al. Analyzing and improving the image quality of StyleGAN [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 8107-8116. |
| [33] | LIU Z, LUO P, WANG X, et al. Deep learning face attributes in the wild [C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 3730-3738. |
| [34] | HEUSEL M, RAMSAUER H, UNTERTHINER T, et al. GANs trained by a two time-scale update rule converge to a local Nash equilibrium [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6629-6640. |
| [35] | ZHANG R, ISOLA P, EFROS A A, et al. The unreasonable effectiveness of deep features as a perceptual metric [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 586-595. |
| [36] | WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity [J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612. |