Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (5): 1299-1304.DOI: 10.11772/j.issn.1001-9081.2020071106

Special Issue: Artificial Intelligence

• Artificial intelligence •

Mixed precision neural network quantization method based on Octave convolution

ZHANG Wenye1,2, SHANG Fangxin2, GUO Hao2,3   

  1. School of Information, Renmin University of China, Beijing 100872, China;
    2. Shanxi Extended Reality Industrial Technology Research Institute Company Limited, Taiyuan Shanxi 030024, China;
    3. College of Information and Computer, Taiyuan University of Technology, Taiyuan Shanxi 030024, China
  • Received:2020-07-27 Revised:2020-09-18 Online:2021-05-10 Published:2020-12-23
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61672374).

  • Corresponding author: ZHANG Wenye
  • About the authors: ZHANG Wenye, born in 1994, M. S. candidate. Her research interests include computer vision and automated software testing.
    SHANG Fangxin, born in 1994, M. S. His research interests include machine learning and image processing.
    GUO Hao, born in 1981, Ph. D., professor, CCF member. His research interests include visual information processing, artificial intelligence, and brain informatics.

Abstract: Deep neural networks with 32-bit floating-point weights require substantial computing resources, which makes large-scale deep neural networks difficult to deploy in scenarios with limited computing power (such as edge computing). To solve this problem, a plug-and-play neural network quantization method was proposed to reduce the computational cost of large-scale neural networks without significantly degrading model performance. First, the high-frequency and low-frequency components of the input feature map were separated based on Octave convolution. Second, convolution kernels of different bit-widths were applied to the high- and low-frequency components respectively. Third, the high- and low-frequency convolution results were quantized to the corresponding bit-widths by different activation functions. Finally, the feature maps of different precisions were mixed to obtain the output of the layer. Experimental results verify the effectiveness of the proposed method for model compression: when the model was compressed to 1+8 bits, the accuracy drop on the CIFAR-10/100 datasets was less than 3 percentage points; on the ImageNet dataset, a ResNet50-based model compressed to 1+4 bits still achieved an accuracy above 70%.
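The four steps in the abstract (frequency split, per-branch convolution at different bit-widths, per-branch activation quantization, and mixing) can be sketched as follows. This is a minimal NumPy illustration under simplifying assumptions, not the paper's implementation: the low-frequency component is approximated by 2×2 average pooling as in Octave convolution, the "convolution" is reduced to a scalar kernel, `quantize` is a generic uniform/binary quantizer, and the helper names (`quantize`, `mixed_precision_octave_layer`) are hypothetical.

```python
import numpy as np

def quantize(x, bits):
    """Generic quantizer (an assumption, not the paper's exact scheme):
    1 bit -> sign times mean magnitude; k bits -> symmetric uniform grid."""
    x = np.asarray(x, dtype=float)
    if bits == 1:
        return np.sign(x) * np.mean(np.abs(x))
    levels = 2 ** (bits - 1) - 1        # e.g. 127 for 8 bits
    m = np.max(np.abs(x))
    if m == 0:
        return x
    scale = m / levels
    return np.round(x / scale).clip(-levels, levels) * scale

def mixed_precision_octave_layer(feat, w_high, w_low, hi_bits=8, lo_bits=1):
    """One mixed-precision layer on a 2D feature map (hypothetical sketch)."""
    h, w = feat.shape
    # 1) Octave-style split: low frequency via 2x2 average pooling,
    #    high frequency as the residual at full resolution.
    low = feat.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    high = feat - np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)
    # 2) apply kernels quantized to different bit-widths per branch
    high_out = high * quantize(w_high, hi_bits)
    low_out = low * quantize(w_low, lo_bits)
    # 3) quantize each branch's activations to its own bit-width
    high_out = quantize(high_out, hi_bits)
    low_out = quantize(low_out, lo_bits)
    # 4) mix precisions: upsample the low branch and sum with the high branch
    low_up = np.repeat(np.repeat(low_out, 2, axis=0), 2, axis=1)
    return high_out + low_up
```

In this sketch, "1+8 bits" corresponds to `lo_bits=1, hi_bits=8`: the smooth low-frequency branch tolerates aggressive (binary) quantization, while the detail-carrying high-frequency branch keeps the higher bit-width.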

Key words: deep neural network, model quantization, model compression, convolutional neural network, deep learning

