[1] LECUN Y A,BOTTOU L,ORR G B,et al. Efficient backprop[M]//ORR G B,MÜLLER K R. Neural Networks:Tricks of the Trade,LNCS 7700. Berlin:Springer,2012:9-48.
[2] HASSIBI B,STORK D G. Second order derivatives for network pruning:optimal brain surgeon[C]//Proceedings of the 5th International Conference on Neural Information Processing Systems. San Francisco:Morgan Kaufmann Publishers Inc.,1992:164-171.
[3] SRINIVAS S,BABU R V. Learning neural network architectures using backpropagation[C]//Proceedings of the 2016 British Machine Vision Conference. Durham:BMVA Press,2016:No. 104.
[4] WEN W,WU C,WANG Y,et al. Learning structured sparsity in deep neural networks[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems. New York:Curran Associates Inc.,2016:2082-2090.
[5] LI H,KADAV A,DURDANOVIC I,et al. Pruning filters for efficient convnets[EB/OL].[2019-06-20]. https://arxiv.org/pdf/1608.08710.pdf.
[6] GUPTA S,AGRAWAL A,GOPALAKRISHNAN K,et al. Deep learning with limited numerical precision[C]//Proceedings of the 32nd International Conference on Machine Learning. New York:JMLR.org,2015:1737-1746.
[7] GYSEL P,MOTAMEDI M,GHIASI S. Hardware-oriented approximation of convolutional neural networks[EB/OL].[2019-06-20]. https://arxiv.org/pdf/1604.03168.pdf.
[8] DENIL M,SHAKIBI B,DINH L,et al. Predicting parameters in deep learning[C]//Proceedings of the 26th International Conference on Neural Information Processing Systems. New York:Curran Associates Inc.,2013:2148-2156.
[9] JADERBERG M,VEDALDI A,ZISSERMAN A. Speeding up convolutional neural networks with low rank expansions[C]//Proceedings of the 2014 British Machine Vision Conference. Durham:BMVA Press,2014:No. 73.
[10] DENTON E,ZAREMBA W,BRUNA J,et al. Exploiting linear structure within convolutional networks for efficient evaluation[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge:MIT Press,2014:1269-1277.
[11] BUCILUǍ C,CARUANA R,NICULESCU-MIZIL A. Model compression[C]//Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York:ACM,2006:535-541.
[12] HINTON G,VINYALS O,DEAN J. Distilling the knowledge in a neural network[EB/OL].[2019-06-20]. https://arxiv.org/pdf/1503.02531.pdf.
[13] HUANG Z,WANG N. Like what you like:knowledge distill via neuron selectivity transfer[EB/OL].[2019-06-20]. https://arxiv.org/pdf/1707.01219.pdf.
[14] LI D,WANG X,KONG D. DeepRebirth:accelerating deep neural network execution on mobile devices[C]//Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto,CA:AAAI Press,2018:2322-2330.
[15] LECUN Y,BOTTOU L,BENGIO Y,et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE,1998,86(11):2278-2324.
[16] 黄文坚,唐源. TensorFlow实战[M]. 北京:电子工业出版社,2017:233-242. (HUANG W J,TANG Y. TensorFlow in Action[M]. Beijing:Publishing House of Electronics Industry,2017:233-242.)
[17] HAN S,POOL J,TRAN J,et al. Learning both weights and connections for efficient neural networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge:MIT Press,2015:1135-1143.
[18] LIU Z,LI J,SHEN Z,et al. Learning efficient convolutional networks through network slimming[C]//Proceedings of the 2017 International Conference on Computer Vision. Piscataway:IEEE,2017:2755-2763.
[19] 靳丽蕾,杨文柱,王思乐,等. 一种用于卷积神经网络压缩的混合剪枝方法[J]. 小型微型计算机系统,2018,39(12):2596-2601. (JIN L L,YANG W Z,WANG S L,et al. Mixed pruning method for convolution neural network compression[J]. Journal of Chinese Computer Systems,2018,39(12):2596-2601.)
[20] FRANKLE J,CARBIN M. The lottery ticket hypothesis:finding sparse trainable neural networks[EB/OL].[2019-06-20]. https://arxiv.org/pdf/1803.03635.pdf.