Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (4): 1269-1274. DOI: 10.11772/j.issn.1001-9081.2021071274

• The 36th CCF National Conference on Computer Applications (CCF NCCA 2021) •

Cascaded cross-domain feature fusion for virtual try-on

HU Xinrong1,2, ZHANG Junyu1,2, PENG Tao1,2, LIU Junping1,2, HE Ruhan1,2, HE Kai1,2

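  1. Engineering Research Center of Hubei Province in Clothing Informationization (Wuhan Textile University), Wuhan 430200
    2. School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan 430200
  • Received: 2021-07-16 Revised: 2021-10-04 Accepted: 2021-10-13 Online: 2022-04-15 Published: 2022-04-10
  • Corresponding author: PENG Tao
  • About the authors: HU Xinrong (born 1973), female, from Wuhan, Hubei, professor, Ph.D., CCF member. Main research interests: image processing, pattern recognition, virtual reality, natural language processing.
    ZHANG Junyu (born 1996), male, from Jingzhou, Hubei, master's student, CCF member. Main research interest: virtual try-on.
    LIU Junping (born 1979), male, from Wuhan, Hubei, associate professor, Ph.D., CCF member. Main research interests: industrial big data, artificial intelligence.
    HE Ruhan (born 1974), male, from Wuhan, Hubei, professor, Ph.D., CCF member. Main research interests: machine learning, big data analysis, computer vision.
    HE Kai (born 1987), male, from Wuhan, Hubei, lecturer, Ph.D., CCF member. Main research interests: network security, security auditing of cloud storage data.
  • Supported by:
    Hubei Province Colleges and Universities Outstanding Young and Middle-Aged Technological Innovation Team Program (T201807)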

Cascaded cross-domain feature fusion for virtual try-on

Xinrong HU1,2, Junyu ZHANG1,2, Tao PENG1,2, Junping LIU1,2, Ruhan HE1,2, Kai HE1,2

  1. Engineering Research Center of Hubei Province in Clothing Informationization (Wuhan Textile University), Wuhan Hubei 430200, China
    2. School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan Hubei 430200, China
  • Received:2021-07-16 Revised:2021-10-04 Accepted:2021-10-13 Online:2022-04-15 Published:2022-04-10
  • Contact: Tao PENG
  • About author:HU Xinrong, born in 1973, Ph. D., professor. Her research interests include image processing, pattern recognition, virtual reality, natural language processing.
    ZHANG Junyu, born in 1996, M. S. candidate. His research interests include virtual try-on.
    LIU Junping, born in 1979, Ph. D., associate professor. His research interests include industrial big data, artificial intelligence.
    HE Ruhan, born in 1974, Ph. D., professor. His research interests include machine learning, big data analysis, computer vision.
    HE Kai, born in 1987, Ph. D., lecturer. His research interests include network security, cloud storage data security audit.
  • Supported by:
    Hubei Province Colleges and Universities Outstanding Young and Middle-Aged Technological Innovation Team Program(T201807)

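Abstract:

Virtual try-on techniques based on an image-synthesis mask strategy preserve clothing details well when the warped clothing is fused with the human body. However, because the position and structure of the body and the clothing are difficult to align during try-on, the results are prone to severe occlusion, which degrades the visual effect. To address this occlusion problem, a U-Net based generator was proposed. The generator adds cascaded spatial and channel attention modules to the U-Net decoder, thereby achieving cross-domain fusion of the local features of the dressed person and the global features of the warped clothing. Specifically, first, the clothing was warped according to the target body pose using a Thin Plate Spline (TPS) transformation predicted by a convolutional network; then, the parsing information of the dressed person and the warped clothing were fed into the proposed generator, and a mask image of the corresponding clothing region was obtained to render an intermediate result; finally, a mask composition strategy was used to combine the warped clothing with the intermediate result through mask processing, producing the final try-on result. Experimental results show that the proposed method not only reduces occlusion but also enhances image details: compared with the CP-VTON method, the average Peak Signal-to-Noise Ratio (PSNR) of the generated images increased by 10.47%, the average FID decreased by 47.28%, and the average Structural Similarity (SSIM) increased by 4.16%.

Key words: virtual try-on, attention mechanism, feature fusion, occlusion handling, cascade, cross-domain feature fusion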
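
The final mask-composition step named in the abstract can be illustrated with a minimal NumPy sketch. It assumes the blending form common in mask-based try-on pipelines, where a soft mask selects between the warped clothing and the rendered intermediate result. The function name `compose_try_on`, the array shapes, and the toy values are hypothetical and are not taken from the paper.

```python
import numpy as np


def compose_try_on(warped_cloth, rendered_person, mask):
    """Blend warped clothing into a rendered person image with a soft mask.

    Assumed (not from the paper): images are float arrays of shape (H, W, 3)
    in [0, 1]; mask has shape (H, W), where 1 means "use the clothing pixel".
    """
    # Clip the mask to [0, 1] and add a channel axis so it broadcasts over RGB.
    m = np.clip(mask, 0.0, 1.0)[..., np.newaxis]
    # Per-pixel convex combination of the two images.
    return m * warped_cloth + (1.0 - m) * rendered_person


# Toy 2x2 inputs (hypothetical values, for illustration only).
cloth = np.full((2, 2, 3), 0.8)
person = np.full((2, 2, 3), 0.2)
mask = np.array([[1.0, 0.0],
                 [0.5, 0.0]])
result = compose_try_on(cloth, person, mask)
```

Where the mask is 1 the output copies the warped clothing, where it is 0 it keeps the rendered result, and intermediate values mix the two, which is why this kind of composition tends to preserve clothing texture.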

Abstract:

Virtual try-on techniques based on an image-synthesis mask strategy retain clothing details well when the warped clothing is fused with the human body. However, because the position and structure of the human body and the clothing are difficult to align during try-on, the result is prone to severe occlusion, which degrades the visual effect. To solve this occlusion problem, a U-Net based generator was proposed. In the generator, cascaded spatial and channel attention modules were added to the U-Net decoder, thereby achieving cross-domain fusion between local features of the warped clothes and global features of the human body. Specifically, first, the clothing was warped according to the target human body pose by using a convolutional network to predict a Thin Plate Spline (TPS) transformation. Then, the representation of the dressed person and the warped clothing were input into the proposed generator, and a mask image of the corresponding clothing area was obtained to render an intermediate result. Finally, a mask composition strategy was used to combine the warped clothing with the intermediate result through mask processing, giving the final try-on result. Experimental results show that the proposed method not only reduces occlusion but also enhances image details. Compared with the Characteristic-Preserving Virtual Try-On Network (CP-VTON) method, the images generated by the proposed method have an average Peak Signal-to-Noise Ratio (PSNR) increased by 10.47%, an average Fréchet Inception Distance (FID) decreased by 47.28%, and an average Structural SIMilarity (SSIM) increased by 4.16%.

Key words: virtual try-on, attention mechanism, feature fusion, occlusion processing, cascaded, cross-domain feature fusion
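
The abstract reports its gains as relative changes in PSNR, FID, and SSIM. As a minimal sketch of how one such figure could be computed, the NumPy code below implements the standard PSNR definition and a relative-change helper. The function names and the assumption that pixel values lie in [0, 1] are illustrative choices, not details from the paper.

```python
import numpy as np


def psnr(reference, generated, max_value=1.0):
    """Standard PSNR in dB: 10 * log10(max_value^2 / MSE)."""
    ref = np.asarray(reference, dtype=np.float64)
    gen = np.asarray(generated, dtype=np.float64)
    mse = np.mean((ref - gen) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


def relative_change_percent(baseline, proposed):
    """Signed relative change of `proposed` with respect to `baseline`, in percent."""
    return (proposed - baseline) / baseline * 100.0


# Toy check (hypothetical images): a uniform error of 0.1 gives an MSE of 0.01.
reference = np.zeros((4, 4, 3))
generated = np.full((4, 4, 3), 0.1)
toy_psnr = psnr(reference, generated)
```

For a higher-is-better metric such as PSNR or SSIM a positive relative change is an improvement, while for FID, where lower is better, the improvement shows up as a negative change.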

CLC Number: