Journal of Computer Applications (official website) ›› 2022, Vol. 42 ›› Issue (7): 2162-2169. DOI: 10.11772/j.issn.1001-9081.2021050836

• Multimedia computing and computer simulation •

Animation video generation model based on Chinese impressionistic style transfer

MAO Wentao (毛文涛) 1,2, WU Guifang (吴桂芳) 1, WU Chao (吴超) 1, DOU Zhi (窦智) 1,2

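  1. College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007
  2. Engineering Lab of Intelligence Business and Internet of Things of Henan Province (Henan Normal University), Xinxiang, Henan 453007
  • Received: 2021-05-21  Revised: 2021-08-27  Accepted: 2021-09-16  Online: 2022-03-08  Published: 2022-07-10
  • Corresponding author: MAO Wentao
  • About the authors: WU Guifang (born 1997), female, a native of Xinyang, Henan. Research interests: machine vision, style transfer.
    WU Chao (born 1998), male, a native of Jiaozuo, Henan, M.S. candidate. Research interests: machine learning, anomaly detection.
    DOU Zhi (born 1984), male, a native of Xinxiang, Henan, associate professor, Ph.D. Research interests: machine learning, object detection.
  • Funding:
    National Natural Science Foundation of China (U1904123); Key Program of Henan Province Science and Technology Project (212102210103)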

Animation video generation model based on Chinese impressionistic style transfer

Wentao MAO 1,2, Guifang WU 1, Chao WU 1, Zhi DOU 1,2

  1. College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007, China
  2. Engineering Lab of Intelligence Business and Internet of Things of Henan Province (Henan Normal University), Xinxiang, Henan 453007, China
  • Received: 2021-05-21  Revised: 2021-08-27  Accepted: 2021-09-16  Online: 2022-03-08  Published: 2022-07-10
  • Contact: Wentao MAO
  • About the authors: WU Guifang, born in 1997. Her research interests include machine vision and style transfer.
    WU Chao, born in 1998, M.S. candidate. His research interests include machine learning and anomaly detection.
    DOU Zhi, born in 1984, Ph.D., associate professor. His research interests include machine learning and object detection.
  • Supported by:
    National Natural Science Foundation of China (U1904123); Key Program of Henan Province Science and Technology Project (212102210103)

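Abstract:

Generative Adversarial Networks (GANs) have already been used to convert images into animation styles. However, existing GAN-based animation generation models mainly target Japanese and American animation and concentrate on extracting and generating realistic styles; they rarely address the transfer of the impressionistic (xieyi) style found in Chinese-style animation, which limits the application of GANs in the large domestic animation production market. To address this problem, a new Chinese-style animation GAN model, CCGAN, is proposed, which integrates the Chinese impressionistic style into the GAN model to automatically generate animation videos in that style. First, inverted residual blocks are added to the generator to build a lightweight deep neural network model and reduce the computational cost of video generation. Second, to extract and transfer the characteristics of the Chinese impressionistic style, namely sharp image edges, abstract content composition, and outline strokes with an ink-wash texture, a gray-scale style loss and a color reconstruction loss are constructed in the generator to constrain the high-level semantic consistency in style between real images and Chinese-style sample images, while a gray-scale adversarial loss and an edge-promoting adversarial loss are constructed in the discriminator to constrain the reconstructed images to keep the same edge characteristics as the sample images. Finally, the Adam algorithm is used to minimize these loss functions, achieving style transfer, and the reconstructed images are assembled into a video. Experimental results show that, compared with CycleGAN and CartoonGAN, currently the most representative style transfer models, the proposed CCGAN effectively learns the Chinese impressionistic style from Chinese-style animations such as Chinese Choir (《中国唱诗班》) while significantly reducing computational cost, making it suitable for the rapid generation of animation videos in large batches.

Key words: generative adversarial network, Chinese-style animation, style transfer, cartoon, deep neural network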

Abstract:

At present, Generative Adversarial Networks (GANs) have been used for image animation style transformation. However, existing GAN-based animation generation models mainly target Japanese and American animations and focus on the extraction and generation of realistic styles. Little attention has been paid to the transfer of the impressionistic style in Chinese-style animations, which limits the application of GANs in the domestic animation production market. To solve this problem, a new Chinese-style animation GAN model, namely Chinese Cartoon GAN (CCGAN), was proposed, which integrates the Chinese impressionistic style into the GAN model to automatically generate animation videos with that style. Firstly, inverted residual blocks were added to the generator to construct a lightweight deep neural network model and reduce the computational cost of video generation. Secondly, in order to extract and transfer the characteristics of the Chinese impressionistic style, such as sharp image edges, abstract content structure and stroke lines with ink texture, a gray-scale style loss and a color reconstruction loss were constructed in the generator to constrain the high-level semantic consistency in style between the real images and the Chinese-style sample images. Moreover, a gray-scale adversarial loss and an edge-promoting adversarial loss were constructed in the discriminator to constrain the reconstructed images to maintain the same edge characteristics as the sample images. Finally, the Adam algorithm was used to minimize the above loss functions to realize style transfer, and the reconstructed images were combined into a video. Experimental results show that, compared with representative style transfer models such as CycleGAN and CartoonGAN, the proposed CCGAN can effectively learn the Chinese impressionistic style from Chinese-style animations such as Chinese Choir while significantly reducing the computational cost, indicating that it is suitable for the rapid generation of large batches of animation videos.
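
The computational saving that the abstract attributes to inverted residual blocks can be illustrated with a rough weight count. The sketch below is not taken from the paper: it assumes a MobileNetV2-style block (1x1 expansion, 3x3 depthwise convolution, 1x1 projection) and compares it with a classic residual block made of two full 3x3 convolutions. The function names, the channel width of 256 and the expansion factor of 2 are hypothetical choices made only for illustration, and biases and normalization layers are ignored.

```python
def conv_params(in_ch, out_ch, kernel, groups=1):
    """Weight count of a 2-D convolution, ignoring bias and normalization."""
    return (in_ch // groups) * out_ch * kernel * kernel


def residual_block_params(channels):
    """Classic residual block: two full 3x3 convolutions, channels -> channels."""
    return 2 * conv_params(channels, channels, 3)


def inverted_residual_block_params(channels, expansion):
    """Assumed MobileNetV2-style layout: expand, depthwise filter, project."""
    hidden = channels * expansion
    expand = conv_params(channels, hidden, 1)                  # 1x1 pointwise expansion
    depthwise = conv_params(hidden, hidden, 3, groups=hidden)  # 3x3, one filter per channel
    project = conv_params(hidden, channels, 1)                 # 1x1 linear projection
    return expand + depthwise + project


if __name__ == "__main__":
    channels, expansion = 256, 2  # hypothetical values, not from the paper
    classic = residual_block_params(channels)
    inverted = inverted_residual_block_params(channels, expansion)
    print(f"classic residual block:  {classic} weights")
    print(f"inverted residual block: {inverted} weights")
    print(f"reduction factor:        {classic / inverted:.2f}")
```

Under these assumptions the classic block has 1,179,648 weights and the inverted residual block has 266,752, roughly a 4.4-fold reduction per block; the actual CCGAN generator configuration may differ.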

Key words: Generative Adversarial Network (GAN), Chinese-style animation, style transfer, cartoon, Deep Neural Network (DNN)

CLC number: