Deep neural network compression algorithm based on hybrid mechanism

doi:10.11772/j.issn.1001-9081.2022091392

Abstract:

With the rapid development of Artificial Intelligence (AI) in recent years, the demand for Deep Neural Networks (DNNs) on resource-limited devices such as embedded and mobile devices has increased sharply. How to compress neural networks without degrading DNN performance is of great theoretical and practical significance, and is currently a hot research topic in deep learning. Firstly, to address the problem that DNNs are difficult to port to resource-limited devices such as mobile devices because of their large model size and high computational cost, the experimental performance of existing DNN compression algorithms in terms of memory usage, running speed, and compression effect was analyzed in depth, and the factors influencing DNN compression algorithms were thereby identified. Then, a knowledge transfer structure composed of a student network and a teacher network was designed, the knowledge distillation, structural design, network pruning, and parameter quantization mechanisms were fused, and a DNN optimization and compression algorithm based on a hybrid mechanism was proposed. Experimental comparison and analysis were conducted on the mini-ImageNet dataset with AlexNet as the benchmark. Experimental results show that the capacity of the compressed AlexNet is reduced by 98.5% with a 6.3% loss of accuracy, which verifies the effectiveness of the proposed algorithm.

Key words: Deep Neural Network (DNN), network compression, network pruning, knowledge distillation, parameter quantization
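
The teacher-student knowledge transfer named in the abstract can be illustrated with the standard distillation loss of Hinton et al. The following is a minimal sketch only: the exact loss used by the authors is not given on this page, and the `temperature` and `alpha` values, as well as the function names, are hypothetical choices for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.7):
    """Weighted sum of a soft-target term and a hard-label term.

    temperature and alpha are assumed values, not taken from the paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between softened teacher and student distributions
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Cross-entropy of the student against the ground-truth label
    hard = -math.log(softmax(student_logits)[label])
    # T^2 keeps the soft-term gradient scale comparable across temperatures
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard

if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    student = [0.5, 2.5, 2.0]
    print(distillation_loss(student, teacher, label=2))
```

When the student reproduces the teacher's logits exactly, the soft-target term vanishes and only the hard-label cross-entropy remains.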

CLC Number: TP183

Xujian ZHAO, Hanglin LI. Deep neural network compression algorithm based on hybrid mechanism[J]. Journal of Computer Applications, 2023, 43(9): 2686-2691.

Figures/Tables 15

Tab. 1 Comparison of classical neural networks

Model	Layers	Size/MB	Parameters/10⁶	Error rate/%
AlexNet (original)	8	>200	60.0	16.40
VGG (Visual Geometry Group)	19	>500	138.0	7.32
GoogLeNet	22	≈50	6.8	6.67
ResNet	152	230	19.4	3.57

Tab. 2 Compression results of different compression algorithms on AlexNet

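Algorithm	Accuracy/%	Compression ratio	Speedup ratio
AlexNet	68.55	—	—
Network pruning	66.44	1.03	1.09
Linear parameter quantization	24.46	1.05	1.10
K-means parameter quantization	64.93	1.05	1.11
Knowledge distillation	69.16	20.45	1.16
Group convolution	58.42	1.05	1.02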
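
Tab. 2 reports 24.46% accuracy for linear parameter quantization against 64.93% for K-means parameter quantization. One plausible contributing factor is reconstruction error: evenly spaced levels ignore where the weights actually concentrate, while learned centroids follow the weight distribution. The sketch below contrasts the two on a toy weight list. It is not the authors' implementation; the function names, the iteration count, and the sample weights are assumptions for illustration.

```python
def linear_quantize(weights, bits):
    """Snap each weight to one of 2**bits evenly spaced levels in [min, max]."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]

def kmeans_quantize(weights, bits, iterations=20):
    """Replace each weight by the nearest of 2**bits centroids (1-D K-means)."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    k = 2 ** bits
    lo, hi = min(weights), max(weights)
    # Start from the same evenly spaced levels that linear quantization uses
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for w in weights:
            nearest = min(range(k), key=lambda i: abs(w - centroids[i]))
            clusters[nearest].append(w)
        # Move each centroid to its cluster mean; empty clusters stay in place
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return [min(centroids, key=lambda c: abs(w - c)) for w in weights]

def mse(original, quantized):
    """Mean squared reconstruction error between two equal-length lists."""
    return sum((a - b) ** 2 for a, b in zip(original, quantized)) / len(original)

if __name__ == "__main__":
    # Hypothetical weights: two tight groups plus one outlier
    sample = [-1.0, -0.98, -1.02, 0.0, 0.01, -0.01, 0.1, 3.0]
    print("linear  MSE:", mse(sample, linear_quantize(sample, 2)))
    print("k-means MSE:", mse(sample, kmeans_quantize(sample, 2)))
```

Because the centroids are initialised at the linear levels and each K-means iteration cannot increase the squared error, the K-means result in this sketch is never worse than the linear one for the same bit width.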

Tab. 3 Experimental results of compression algorithms

Algorithm	Accuracy/%	Compression ratio	Speedup ratio	Capacity/MB
AlexNet (original)	68.55	—	—	177.08
KD	64.93	20.45	1.11	—
KD+GC	66.74	50.11	1.06	—
KD+GC+NP	64.56	66.92	1.04	—
KD+GC+NP+CQ	64.25	89.42	1.08	2.65
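
The headline figures in the abstract can be checked against the first and last rows of Tab. 3. The short calculation below is only a reading aid; the variable names are illustrative. It suggests that the 6.3% accuracy loss is a relative drop (about 4.3 percentage points in absolute terms).

```python
# Values copied from Tab. 3 (original AlexNet vs. KD+GC+NP+CQ)
original_capacity_mb = 177.08
compressed_capacity_mb = 2.65
original_accuracy = 68.55
compressed_accuracy = 64.25

# Share of the original capacity that was removed
size_reduction_pct = (
    (original_capacity_mb - compressed_capacity_mb) / original_capacity_mb * 100
)

# Accuracy drop in percentage points and relative to the original accuracy
absolute_accuracy_drop = original_accuracy - compressed_accuracy
relative_accuracy_drop_pct = absolute_accuracy_drop / original_accuracy * 100

print(round(size_reduction_pct, 1))          # 98.5
print(round(absolute_accuracy_drop, 2))      # 4.3
print(round(relative_accuracy_drop_pct, 1))  # 6.3
```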

Fig. 1 Model structure of the proposed algorithm

Fig. 2 Process of knowledge distillation

Fig. 3 Comparison of results of neural networks after knowledge distillation

Fig. 4 Depthwise separable convolution

Fig. 5 Group convolution

Fig. 6 Flow of network pruning

Fig. 7 Parameter quantization design based on K-means

Fig. 8 Data architecture of mini-ImageNet

Fig. 9 Examples of data

Fig. 10 Comparison of experimental results under different deletion thresholds

Fig. 11 Comparison of experimental results under different quantization bit numbers

Fig. 12 Comparison of accuracy loss under different grouping conditions
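
Figs. 4 and 5 refer to depthwise separable convolution and group convolution. As a rough illustration of why these structural designs shrink a model, the sketch below counts convolution weights (biases ignored) with the standard formulas. The layer shape used in the example is hypothetical and is not taken from the paper.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    """Weight count of a 2-D convolution with square kernels, biases ignored."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels
    return (c_in // groups) * c_out * kernel * kernel

def depthwise_separable_params(c_in, c_out, kernel):
    """Depthwise (one filter per input channel) plus 1x1 pointwise convolution."""
    depthwise = conv_params(c_in, c_in, kernel, groups=c_in)
    pointwise = conv_params(c_in, c_out, 1)
    return depthwise + pointwise

if __name__ == "__main__":
    # Hypothetical layer: 256 input channels, 384 output channels, 3x3 kernel
    print(conv_params(256, 384, 3))                 # 884736
    print(conv_params(256, 384, 3, groups=2))       # 442368
    print(depthwise_separable_params(256, 384, 3))  # 100608
```

Splitting into g groups divides the weight count by g, which is the trade-off explored across grouping conditions in Fig. 12.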
