Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (12): 3644-3650. DOI: 10.11772/j.issn.1001-9081.2020050699

• Virtual reality and multimedia computing •

Portrait segmentation on mobile devices based on deep neural network

YANG Jianwei1, YAN Qun1,2, YAO Jianmin1,2, LIN Zhixian1

  1. College of Physics and Information Engineering, Fuzhou University, Fuzhou Fujian 350108, China;
  2. Jinjiang RichSense Electronic Technology Company Limited, Jinjiang Fujian 362200, China
  • Received: 2020-05-25 Revised: 2020-07-13 Publication date: 2020-12-10 Release date: 2020-08-14
  • Corresponding author: YAN Qun (born 1965), male, Chinese American, professor, Ph.D. Research interests: Micro-LED, artificial intelligence, information display. qunfyan@gmail.com
  • About the authors: YANG Jianwei (born 1995), male, from Quanzhou, Fujian, master's candidate. Research interests: deep learning, semantic image segmentation; YAO Jianmin (born 1978), male, from Putian, Fujian, associate research fellow, Ph.D. Research interests: artificial intelligence, image processing, information display; LIN Zhixian (born 1975), male, from Quanzhou, Fujian, professor, Ph.D. Research interests: information display, flat-panel display driving systems, image processing.
  • Supported by:
    This work is partially supported by the National Key Research and Development Program of China (2016YFB0401503), the Science and Technology Major Program of Guangdong Province (2016B090906001), the Science and Technology Major Program of Fujian Province (2014HZ0003-1), and the Open Foundation of Guangdong Provincial Key Laboratory of Optical Information Materials and Technology (2017B030301007).

Abstract: Most existing portrait segmentation algorithms ignore the hardware limitations of mobile devices and pursue segmentation quality alone, so they cannot meet the speed requirements of mobile platforms. To address this problem, a portrait segmentation network that runs efficiently on mobile devices is proposed. First, the network is built on a lightweight U-shaped encoder-decoder architecture. Second, because a Fully Convolutional Network (FCN) is limited by its small receptive field and cannot fully capture long-range information, an Expectation Maximization Attention Unit (EMAU) is placed after the encoder and before the decoder. Third, a multi-layer boundary auxiliary loss is added during training to improve the accuracy of portrait boundary contours. Finally, the model is quantized and compressed. The proposed network is compared with networks such as PortraitFCN+, ENet and BiSeNet on the Veer dataset. Experimental results show that the proposed network improves both inference speed and segmentation quality, and processes RGB images with a resolution of 224×224 at an accuracy of 95.57%.

Key words: deep neural network, portrait segmentation, Expectation Maximization (EM), auxiliary loss, attention
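
The abstract does not spell out how the EMAU placed between encoder and decoder captures long-range information. The following is a minimal sketch of the general expectation-maximization attention idea, not the authors' implementation. It alternates an E-step (soft assignment of every pixel to a small set of bases) with an M-step (bases updated as weighted means of all pixels), then rebuilds each pixel from the bases. The function name `em_attention`, the parameters `n_iters` and `temperature`, and the flat list-of-lists feature layout are illustrative assumptions, and the basis normalization and surrounding convolutions of a full unit are left out.

```python
import math


def em_attention(x, bases, n_iters=3, temperature=1.0):
    """Illustrative EM attention over flattened features (hypothetical helper).

    x:     N pixels, each a list of C feature values.
    bases: K initial bases, each a list of C values.
    Returns the re-estimated features (N x C) and the responsibilities (N x K).
    """
    if n_iters < 1:
        raise ValueError("n_iters must be at least 1")
    n_pixels, n_channels, n_bases = len(x), len(x[0]), len(bases)
    z = []
    for _ in range(n_iters):
        # E-step: softmax over the K bases gives each pixel's responsibilities.
        z = []
        for pixel in x:
            logits = [temperature * sum(p * b for p, b in zip(pixel, basis))
                      for basis in bases]
            peak = max(logits)  # subtract the maximum for numerical stability
            exps = [math.exp(v - peak) for v in logits]
            total = sum(exps)
            z.append([e / total for e in exps])
        # M-step: each basis becomes the responsibility-weighted mean of all pixels,
        # which is where information from distant pixels gets mixed in.
        new_bases = []
        for k in range(n_bases):
            weight = sum(z[n][k] for n in range(n_pixels)) + 1e-6
            new_bases.append([
                sum(z[n][k] * x[n][c] for n in range(n_pixels)) / weight
                for c in range(n_channels)
            ])
        bases = new_bases
    # Re-estimation: every pixel is rebuilt as a mixture of the K bases.
    out = [[sum(z[n][k] * bases[k][c] for k in range(n_bases))
            for c in range(n_channels)]
           for n in range(n_pixels)]
    return out, z


if __name__ == "__main__":
    # Toy input: four "pixels" with two channels, two initial bases.
    features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    rebuilt, resp = em_attention(features, [[1.0, 0.0], [0.0, 1.0]])
    for row in rebuilt:
        print([round(v, 3) for v in row])
```

Because the bases are computed from all N pixels, every rebuilt pixel depends on the whole feature map while the cost grows with N×K rather than N×N, which is one plausible reason such a unit suits a lightweight mobile network.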

CLC Number: