Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (5): 1299-1304.DOI: 10.11772/j.issn.1001-9081.2020071106

Special Issue: Artificial Intelligence

• Artificial intelligence •

Mixed precision neural network quantization method based on Octave convolution

ZHANG Wenye1,2, SHANG Fangxin2, GUO Hao2,3   

  1. School of Information, Renmin University of China, Beijing 100872, China;
    2. Shanxi Extended Reality Industrial Technology Research Institute Company Limited, Taiyuan Shanxi 030024, China;
    3. College of Information and Computer, Taiyuan University of Technology, Taiyuan Shanxi 030024, China
  • Received:2020-07-27 Revised:2020-09-18 Online:2021-05-10 Published:2020-12-23
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61672374).

  • Corresponding author: ZHANG Wenye
  • About the authors: ZHANG Wenye, born in 1994, M. S. candidate. Her research interests include computer vision and automated software testing.
    SHANG Fangxin, born in 1994, M. S. His research interests include machine learning and image processing.
    GUO Hao, born in 1981, Ph. D., professor, CCF member. His research interests include visual information processing, artificial intelligence, and brain informatics.

Abstract: Deep neural networks with 32-bit floating-point weights require substantial computing resources, which makes large-scale deep neural networks difficult to deploy in scenarios with limited computing power (such as edge computing). To solve this problem, a plug-and-play neural network quantization method was proposed to reduce the computational cost of large-scale neural networks without significantly degrading model performance. First, the high-frequency and low-frequency components of the input feature map were separated based on Octave convolution. Second, convolution kernels of different bit-widths were applied to the high- and low-frequency components respectively. Third, the high- and low-frequency convolution results were quantized to the corresponding bit-widths by different activation functions. Finally, the feature maps of different precisions were mixed to obtain the output of the layer. Experimental results verify the effectiveness of the proposed method for model compression: when the model was compressed to 1+8 bits, the accuracy drop on the CIFAR-10/100 datasets was less than 3 percentage points; on the ImageNet dataset, a ResNet50-based model compressed to 1+4 bits still achieved an accuracy above 70%.
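The four steps in the abstract (frequency split, per-branch convolution at different bit-widths, per-branch activation quantization, and mixing) can be sketched as follows. This is a minimal NumPy illustration under simplifying assumptions, not the paper's implementation: the low-frequency component is approximated by 2×2 average pooling as in Octave convolution, the "convolution" is reduced to a scalar kernel, `quantize` is a generic uniform/binary quantizer, and the helper names (`quantize`, `mixed_precision_octave_layer`) are hypothetical.

```python
import numpy as np

def quantize(x, bits):
    """Generic quantizer (an assumption, not the paper's exact scheme):
    1 bit -> sign times mean magnitude; k bits -> symmetric uniform grid."""
    x = np.asarray(x, dtype=float)
    if bits == 1:
        return np.sign(x) * np.mean(np.abs(x))
    levels = 2 ** (bits - 1) - 1        # e.g. 127 for 8 bits
    m = np.max(np.abs(x))
    if m == 0:
        return x
    scale = m / levels
    return np.round(x / scale).clip(-levels, levels) * scale

def mixed_precision_octave_layer(feat, w_high, w_low, hi_bits=8, lo_bits=1):
    """One mixed-precision layer on a 2D feature map (hypothetical sketch)."""
    h, w = feat.shape
    # 1) Octave-style split: low frequency via 2x2 average pooling,
    #    high frequency as the residual at full resolution.
    low = feat.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    high = feat - np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)
    # 2) apply kernels quantized to different bit-widths per branch
    high_out = high * quantize(w_high, hi_bits)
    low_out = low * quantize(w_low, lo_bits)
    # 3) quantize each branch's activations to its own bit-width
    high_out = quantize(high_out, hi_bits)
    low_out = quantize(low_out, lo_bits)
    # 4) mix precisions: upsample the low branch and sum with the high branch
    low_up = np.repeat(np.repeat(low_out, 2, axis=0), 2, axis=1)
    return high_out + low_up
```

In this sketch, "1+8 bits" corresponds to `lo_bits=1, hi_bits=8`: the smooth low-frequency branch tolerates aggressive (binary) quantization, while the detail-carrying high-frequency branch keeps the higher bit-width.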

Key words: deep neural network, model quantization, model compression, convolutional neural network, deep learning

