Journal of Computer Applications, 2020, Vol. 40, Issue (3): 621-625. DOI: 10.11772/j.issn.1001-9081.2019081363

• Artificial intelligence •

Accelerated compression method for convolutional neural network combining with pruning and stream merging

XIE Binhong1, ZHONG Rixin1, PAN Lihu1,2, ZHANG Yingjun1   

  1. Department of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan Shanxi 030024, China;
  2. Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
  • Received: 2019-08-06; Revised: 2019-10-10; Online: 2020-03-10; Published: 2019-10-31
  • Supported by:
    This work is partially supported by the Science and Technology Major Project of Shanxi Province (20141101001) and the Key Research and Development Project of Shanxi Province (201803D121048).


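  • Corresponding author: ZHONG Rixin (钟日新)
  • About the authors: XIE Binhong (谢斌红, born 1972), male, from Wanrong, Shanxi, associate professor, M.S.; research interests: software architecture and service computing. ZHONG Rixin (钟日新, born 1995), male, from Shuozhou, Shanxi, M.S. candidate; research interests: software architecture and deep learning. PAN Lihu (潘理虎, born 1974), male, from Zhumadian, Henan, associate professor, Ph.D.; research interests: artificial intelligence and software engineering. ZHANG Yingjun (张英俊, born 1969), male, from Hejin, Shanxi, senior engineer, M.S.; research interests: software architecture and intelligent software.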

Abstract: Deep convolutional neural networks are generally large in scale and computationally complex, which limits their application in environments with strict real-time requirements or constrained resources. It is therefore necessary to compress and accelerate existing convolutional neural network structures. To solve this problem, a hybrid compression method combining pruning and stream merging was proposed. The method compresses the model from different angles, further reducing the memory and time consumption caused by parameter redundancy and structural redundancy. Firstly, the redundant parameters in each layer were pruned from inside the model. Then, at the level of model structure, the non-essential layers were stream-merged with the important layers. Finally, the accuracy of the model was restored by retraining. The experimental results on the MNIST dataset show that the proposed hybrid compression method compresses LeNet-5 to 1/20 of its original size and improves its running speed by 8 times without reducing the accuracy of the model.
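
The two compression angles named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the pruning criterion or the merging rule, so the sketch assumes magnitude-based pruning and, as a stand-in for stream merging, folds two consecutive affine layers with no nonlinearity between them into one. The function names `magnitude_prune` and `merge_affine`, the layer shapes, and the 75% sparsity are all hypothetical, and the retraining step is not modeled.

```python
import numpy as np


def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude fraction `sparsity` of the entries.

    Assumption: magnitude is used as the redundancy criterion; the
    abstract does not say which criterion the paper uses.
    """
    k = int(round(sparsity * weights.size))
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    # k-th smallest absolute value; entries at or below it are dropped.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask


def merge_affine(w1, b1, w2, b2):
    """Fold y = w2 @ (w1 @ x + b1) + b2 into a single affine layer.

    Assumption: a purely linear stand-in for stream merging. It is exact
    only because no nonlinearity sits between the two layers.
    """
    return w2 @ w1, w2 @ b1 + b2


rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(8, 16)), rng.normal(size=8)
w2, b2 = rng.normal(size=(4, 8)), rng.normal(size=4)

# Angle 1: remove redundant parameters inside a layer.
pruned_w1, mask = magnitude_prune(w1, sparsity=0.75)
kept_fraction = mask.mean()

# Angle 2: remove a layer from the structure by merging it into its neighbor.
merged_w, merged_b = merge_affine(pruned_w1, b1, w2, b2)

x = rng.normal(size=16)
two_layer = w2 @ (pruned_w1 @ x + b1) + b2
one_layer = merged_w @ x + merged_b
print("kept fraction:", kept_fraction)
print("outputs match:", np.allclose(two_layer, one_layer))
```

In this toy case the merged layer reproduces the two-layer output exactly, but the merged matrix is dense again, so the sketch shows each idea separately rather than their combined savings. Merging across a nonlinearity or pooling layer would not be exact, which is consistent with the abstract listing retraining as the final step.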

Key words: Convolutional Neural Network (CNN), model compression, network pruning, stream merging, redundancy
