Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (7): 1987-1994. DOI: 10.11772/j.issn.1001-9081.2023071027

• Artificial Intelligence •

Deep network compression method based on low-rank decomposition and vector quantization

Dongwei WANG1,2,3, Baichen LIU1,2,3, Zhi HAN1,2, Yanmei WANG1,2,3, Yandong TANG1,2

  1. State Key Laboratory of Robotics (Shenyang Institute of Automation, Chinese Academy of Sciences), Shenyang, Liaoning 110016, China
    2. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang, Liaoning 110016, China
    3. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2023-07-30 Revised: 2023-09-18 Accepted: 2023-09-21 Online: 2023-10-26 Published: 2024-07-10
  • Contact: Zhi HAN
  • About author: WANG Dongwei, born in 1999 in Tangshan, Hebei, M. S. candidate. His research interests include deep network compression and knowledge transfer.
    LIU Baichen, born in 1994 in Jilin, Jilin, Ph. D. candidate. His research interests include deep learning and deep network compression.
    WANG Yanmei, born in 1996 in Weihai, Shandong, Ph. D. candidate. Her research interests include transfer learning and domain generalization.
    TANG Yandong, born in 1962 in Liaocheng, Shandong, Ph. D., research fellow. His research interests include robot vision, image processing, and pattern recognition.
    Corresponding author: HAN Zhi, born in 1983 in Shenyang, Liaoning, Ph. D., research fellow. His research interests include computer vision, matrix completion, and deep learning.
  • Supported by:
    National Key Research and Development Program (2020YFB1313400)

Abstract:

With the development of artificial intelligence, deep neural networks have become essential tools in various pattern recognition tasks. Because deep Convolutional Neural Networks (CNNs) have huge numbers of parameters and high computational complexity, deploying them on edge computing devices with limited computing resources and storage space is challenging. Therefore, deep network compression has become an important research topic in recent years. Low-rank decomposition and vector quantization are two important branches of deep network compression research; both aim to find a compact representation of the original network, thereby reducing the redundancy of network parameters. By establishing a joint compression framework, a deep network compression method based on low-rank decomposition and vector quantization, called Quantized Tensor Decomposition (QTD), was proposed. The method performs further quantization on top of the low-rank structure of the network to obtain a higher compression ratio. Experimental results of applying the proposed method to classical ResNet on the CIFAR-10 dataset show that QTD can compress the number of network parameters to 1% of the original with an accuracy loss of only 1.71 percentage points. Moreover, the proposed method was compared with the quantization-based method PQF (Permute, Quantize, and Fine-tune), the low-rank decomposition-based method TDNR (Tucker Decomposition with Nonlinear Response), and the pruning-based method CLIP-Q (Compression Learning by In-parallel Pruning-Quantization) on the large-scale ImageNet dataset. Experimental results show that QTD achieves better classification accuracy than these methods within the same compression range.

Key words: Convolutional Neural Network (CNN), tensor decomposition, vector quantization, model compression, image classification
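
To make the idea concrete, the following is a minimal sketch (in Python with NumPy) of the two steps that QTD combines: a convolutional kernel is first given a low-rank factorization, and one of the resulting factors is then vector-quantized with a small learned codebook. This is not the authors' implementation: a truncated SVD of the unfolded kernel stands in for the tensor decomposition used in the paper, and the rank r, sub-vector length d, and codebook size k are arbitrary illustrative values.

import numpy as np

rng = np.random.default_rng(0)

# Toy 3x3 convolution weight with shape (out_channels, in_channels, kh, kw);
# the values are random because this sketch only illustrates the pipeline.
W = rng.standard_normal((64, 32, 3, 3)).astype(np.float32)

# Step 1: low-rank decomposition. The kernel is unfolded along its output
# channels and a truncated SVD is applied (a stand-in for the tensor
# decomposition used in the paper).
W_mat = W.reshape(W.shape[0], -1)          # (64, 288) unfolded kernel
U, S, Vt = np.linalg.svd(W_mat, full_matrices=False)
r = 16                                     # assumed target rank
A = U[:, :r] * S[:r]                       # (64, r) factor
B = Vt[:r, :]                              # (r, 288) factor

# Step 2: vector quantization of a factor with plain k-means.
def vector_quantize(M, d, k, iters=20):
    """Split M into length-d sub-vectors and learn a k-entry codebook."""
    vecs = M.reshape(-1, d)
    codebook = vecs[rng.choice(len(vecs), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign every sub-vector to its nearest codeword
        dists = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        # move each codeword to the mean of its assigned sub-vectors
        for j in range(k):
            members = vecs[assign == j]
            if len(members) > 0:
                codebook[j] = members.mean(axis=0)
    return codebook, assign

codebook, codes = vector_quantize(B, d=4, k=64)    # assumed d and k
B_q = codebook[codes].reshape(B.shape)             # quantized factor

# Reconstruct an approximation of the original kernel from A and the
# quantized factor, and report the relative error.
W_approx = (A @ B_q).reshape(W.shape)
rel_err = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
print(f"relative reconstruction error: {rel_err:.3f}")

With these toy sizes, what would be stored after compression is the factor A, the 64-entry codebook, and the small integer codes rather than the full-precision kernel, which illustrates on a single layer the joint low-rank-plus-quantization saving that QTD pursues at network scale.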

CLC Number: