Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (6): 1589-1596. DOI: 10.11772/j.issn.1001-9081.2020121914

Special topic: 2020 National Annual Conference on Open Distributed and Parallel Computing (DPCS 2020)


Deep neural network compression algorithm based on combined dynamic pruning

ZHANG Mingming1, LU Qingning2, LI Wenzhong2, SONG Hu1

  1. Information and Communication Branch, State Grid Jiangsu Electric Power Company Limited, Nanjing, Jiangsu 210024, China;
  2. State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing, Jiangsu 210023, China
  • Received: 2020-11-04  Revised: 2021-03-29  Online: 2021-06-10  Published: 2021-06-21
  • Corresponding author: LU Qingning
  • About the authors: ZHANG Mingming (1974-), male, born in Changzhou, Jiangsu, senior engineer, M. S., research interests: deep learning, acceleration of convolutional neural network models; LU Qingning (1997-), male, born in Nanjing, Jiangsu, M. S. candidate, research interests: machine learning, model compression, anomaly detection; LI Wenzhong (1979-), male, born in Pingnan, Guangxi, professor, Ph. D. supervisor, Ph. D., CCF member, research interests: distributed computing, deep learning; SONG Hu (1986-), male, born in Hefei, Anhui, senior engineer, Ph. D., research interests: machine learning, big data analysis.
  • Supported by:
    This work is partially supported by the Science and Technology Project of State Grid Jiangsu Electric Power Company Limited (J2020069).



Abstract: As a branch of model compression, network pruning reduces computational cost by removing unimportant parameters from a deep neural network. However, permanent pruning causes an irreversible loss of model capacity. To address this issue, a combined dynamic pruning algorithm was proposed to jointly analyze the characteristics of the convolution kernels and the input images. On the one hand, some convolution kernels were zeroized but still allowed to be updated during training; only after the network converged were the zeroized kernels permanently removed. On the other hand, features sampled from the input images were analyzed by a channel importance prediction network to determine which channels can be skipped during the convolution operation. Experimental results on M-CifarNet and VGG16 show that combined dynamic pruning achieves floating-point operation compression ratios of 2.11 and 1.99 respectively, with accuracy drops of less than 0.8 and 1.2 percentage points respectively compared with the baseline models (M-CifarNet and VGG16). Compared with existing network pruning algorithms, combined dynamic pruning effectively reduces the number of Floating-Point Operations (FLOPs) and the parameter scale of the model, and achieves higher accuracy under the same compression ratio.
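To make the two mechanisms in the abstract concrete, below is a minimal PyTorch sketch written for this page. It is an illustrative assumption, not the authors' implementation: the names SoftPrunedConv2d, ChannelGate, and CombinedPrunedBlock, the keep_ratio parameter, and the L2-norm importance criterion are all hypothetical choices introduced here.

```python
# Hedged sketch of "combined dynamic pruning": soft (zeroized, revivable)
# kernel pruning plus a per-input channel importance predictor.
# All names and hyperparameters below are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftPrunedConv2d(nn.Module):
    """Convolution whose weakest kernels are zeroized rather than deleted,
    so they keep receiving gradient updates and may recover during
    training; permanent removal would happen only after convergence."""

    def __init__(self, in_ch, out_ch, k=3, keep_ratio=0.7):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.keep_ratio = keep_ratio

    @torch.no_grad()
    def zeroize_unimportant(self):
        # Rank filters by L2 norm (an assumed criterion) and zero the rest.
        w = self.conv.weight                      # (out_ch, in_ch, k, k)
        norms = w.flatten(1).norm(p=2, dim=1)
        n_keep = max(1, int(self.keep_ratio * w.size(0)))
        weakest = norms.argsort(descending=True)[n_keep:]
        w[weakest] = 0.0

    def forward(self, x):
        return self.conv(x)


class ChannelGate(nn.Module):
    """Hypothetical channel importance prediction network: scores output
    channels from a pooled summary of the input feature map, so that
    low-scoring channels can be skipped for this particular input."""

    def __init__(self, in_ch, out_ch, keep_ratio=0.7):
        super().__init__()
        self.fc = nn.Linear(in_ch, out_ch)
        self.keep_ratio = keep_ratio

    def forward(self, x):
        summary = F.adaptive_avg_pool2d(x, 1).flatten(1)   # (N, in_ch)
        scores = self.fc(summary)                          # (N, out_ch)
        k = max(1, int(self.keep_ratio * scores.size(1)))
        kth = scores.topk(k, dim=1).values[:, -1:]
        # Hard 0/1 mask for clarity; training the gate end to end would
        # need a soft or straight-through relaxation instead.
        return (scores >= kth).float()


class CombinedPrunedBlock(nn.Module):
    """Combines both mechanisms: soft-pruned kernels plus a per-input
    dynamic channel mask applied to the convolution output."""

    def __init__(self, in_ch, out_ch, keep_ratio=0.7):
        super().__init__()
        self.conv = SoftPrunedConv2d(in_ch, out_ch, keep_ratio=keep_ratio)
        self.gate = ChannelGate(in_ch, out_ch, keep_ratio)

    def forward(self, x):
        mask = self.gate(x)                                # (N, out_ch)
        return self.conv(x) * mask[:, :, None, None]


block = CombinedPrunedBlock(16, 32)
x = torch.randn(8, 16, 32, 32)
y = block(x)                      # (8, 32, 32, 32), channels masked per input
block.conv.zeroize_unimportant()  # soft-prune between training epochs
```

Note that this sketch only zeroes the masked outputs; an actual implementation would avoid computing the masked filters altogether, which is where the reported FLOPs savings come from.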

Key words: model compression, network pruning, dynamic pruning, Deep Neural Network (DNN), convolution kernel
