Journal of Computer Applications, 2023, Vol. 43, Issue (3): 685-691. DOI: 10.11772/j.issn.1001-9081.2022010032

• Artificial intelligence

Improved method of convolution neural network based on matrix decomposition

Zhenliang LI1, Bo LI2

  1. Faculty of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, China
  2. Computer Teaching & Experiment Center, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, China
  • Received: 2022-01-11 Revised: 2022-03-13 Accepted: 2022-03-22 Online: 2022-04-11 Published: 2023-03-10
  • Contact: Bo LI
  • About the authors: LI Zhenliang (李振亮), born in 1997 in Xuchang, Henan, M. S. candidate. His research interests include deep learning and object detection.
    LI Bo (李波), born in 1968 in Shangluo, Shaanxi, professor, CCF member. His research interests include computer simulation and artificial intelligence.

Abstract:

To address the difficulty of optimizing traditional Convolutional Neural Networks (CNNs) during training, an improved CNN method based on matrix decomposition is proposed. First, during training, the convolution kernel parameter tensor of the model's convolutional layers is converted by matrix decomposition into the product of multiple parameter matrices, which overparameterizes the model. Second, these additional linear parameters take part in back propagation and are updated synchronously with the other model parameters, which improves the optimization process of gradient descent. After training, the matrix product is restored to standard convolution kernel parameters, so the computational complexity of forward propagation during inference is the same as before the improvement. Using thin QR decomposition and reduced Singular Value Decomposition (SVD), classification experiments were carried out on the CIFAR-10 (Canadian Institute For Advanced Research, 10 classes) dataset, and further generalization experiments were carried out with different image classification datasets and different initialization methods. Experimental results show that, for 7 Visual Geometry Group (VGG) and Residual Network (ResNet) models of different depths, classification accuracy with matrix decomposition is higher than that of the original CNN models. The matrix decomposition method thus allows a CNN to reach higher classification accuracy more quickly and eventually converge to a better local optimum.
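
As an illustration only, the train-time factorization and post-training restoration described in the abstract might be sketched with NumPy as follows. The reshape of the kernel tensor to a `(c_out, c_in*kh*kw)` matrix, the helper names `factorize_kernel` and `collapse_factors`, the example kernel size, and the treatment of every factor as a full trainable matrix are assumptions of this sketch, not details taken from the paper.

```python
from functools import reduce

import numpy as np


def factorize_kernel(kernel, method="qr"):
    """Split a conv kernel of shape (c_out, c_in, kh, kw) into matrix factors.

    Assumption of this sketch: the kernel is flattened to a
    (c_out, c_in*kh*kw) matrix before it is decomposed.
    """
    mat = kernel.reshape(kernel.shape[0], -1)
    if method == "qr":
        # Thin (reduced) QR: mat = q @ r
        q, r = np.linalg.qr(mat, mode="reduced")
        return [q, r]
    if method == "svd":
        # Reduced SVD: mat = u @ diag(s) @ vt
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        return [u, np.diag(s), vt]
    raise ValueError(f"unknown method: {method}")


def collapse_factors(factors, kernel_shape):
    """Multiply the factors and restore the standard kernel shape."""
    return reduce(np.matmul, factors).reshape(kernel_shape)


# Hypothetical layer: 16 output channels, 3 input channels, 3x3 kernels.
rng = np.random.default_rng(0)
kernel = rng.standard_normal((16, 3, 3, 3))

qr_factors = factorize_kernel(kernel, "qr")
svd_factors = factorize_kernel(kernel, "svd")

# The factors together hold more parameters than the kernel itself
# (overparameterization), yet their product reproduces the kernel.
n_qr = sum(f.size for f in qr_factors)
n_svd = sum(f.size for f in svd_factors)
restored = collapse_factors(svd_factors, kernel.shape)
```

In a training loop the factors, rather than `kernel`, would receive the gradient updates. `collapse_factors` would then be called once after training, so that inference runs an ordinary convolution of the original size.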

Key words: Convolutional Neural Network (CNN), matrix decomposition, Singular Value Decomposition (SVD), overparameterization, image classification

CLC Number: