Journal of Computer Applications, 2024, Vol. 44, Issue (7): 1987-1994. DOI: 10.11772/j.issn.1001-9081.2023071027
• Artificial intelligence
Dongwei WANG1,2,3, Baichen LIU1,2,3, Zhi HAN1,2, Yanmei WANG1,2,3, Yandong TANG1,2
Received: 2023-07-30
Revised: 2023-09-18
Accepted: 2023-09-21
Online: 2023-10-26
Published: 2024-07-10
Contact: Zhi HAN
About author:
WANG Dongwei (王东炜), born in 1999 in Tangshan, Hebei, M. S. candidate. His research interests include deep network compression and knowledge transfer.
Dongwei WANG, Baichen LIU, Zhi HAN, Yanmei WANG, Yandong TANG. Deep network compression method based on low-rank decomposition and vector quantization[J]. Journal of Computer Applications, 2024, 44(7): 1987-1994.
王东炜, 刘柏辰, 韩志, 王艳美, 唐延东. 基于低秩分解和向量量化的深度网络压缩方法[J]. 计算机应用, 2024, 44(7): 1987-1994.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023071027
| Original layer | Input/output channels | Kernel size | Decomposed layer | Input/output channels | Kernel size |
|---|---|---|---|---|---|
| conv1 | 64/64 | 3×3 | conv1.1 | 64/32 | 1×1 |
| | | | conv1.2 | 32/32 | 3×3 |
| | | | conv1.3 | 32/64 | 1×1 |
| conv2 | 64/128 | 3×3 | conv2.1 | 64/32 | 1×1 |
| | | | conv2.2 | 32/64 | 3×3 |
| | | | conv2.3 | 64/128 | 1×1 |
| conv3 | 128/256 | 3×3 | conv3.1 | 128/64 | 1×1 |
| | | | conv3.2 | 64/128 | 3×3 |
| | | | conv3.3 | 128/256 | 1×1 |
Tab. 1 Low-rank structures of different convolutional layers in ResNet
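To make the effect of the low-rank structures in Tab. 1 concrete, the following minimal sketch counts convolution weights before and after decomposition. It assumes plain (ungrouped) convolutions without biases; the names `conv_params`, `TAB1_LAYERS`, and `summarize` are illustrative and do not come from the paper. The factors it prints reflect the low-rank structure alone and are not the compression ratios reported in the later tables.

```python
# Weight counts for the layer shapes listed in Tab. 1.
# Assumption: biases are ignored and every layer is an ungrouped convolution.

def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Number of weights in a k x k convolution with c_in inputs and c_out outputs."""
    return c_in * c_out * k * k

# name -> (original layer, [decomposed layers]); each layer is (c_in, c_out, kernel)
TAB1_LAYERS = {
    "conv1": ((64, 64, 3), [(64, 32, 1), (32, 32, 3), (32, 64, 1)]),
    "conv2": ((64, 128, 3), [(64, 32, 1), (32, 64, 3), (64, 128, 1)]),
    "conv3": ((128, 256, 3), [(128, 64, 1), (64, 128, 3), (128, 256, 1)]),
}

def summarize(layers):
    """Return name -> (original weights, decomposed weights, reduction factor)."""
    summary = {}
    for name, (original, parts) in layers.items():
        before = conv_params(*original)
        after = sum(conv_params(*p) for p in parts)
        summary[name] = (before, after, before / after)
    return summary

if __name__ == "__main__":
    for name, (before, after, factor) in summarize(TAB1_LAYERS).items():
        print(f"{name}: {before} -> {after} weights ({factor:.2f}x fewer)")
```

Under these assumptions the 1×1 layers carry little of the cost, so most of the remaining weights sit in the reduced 3×3 core.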
| Model | Compression regime | | | |
|---|---|---|---|---|
| ResNet-18 | Small | 1.6 | 2 | 2 |
| | Medium | 2.0 | 4 | 4 |
| | Large | 4.0 | 4 | 4 |
| ResNet-50 | Small | 2.0 | 4 | 4 |
| | Medium | 4.0 | 8 | 4 |
| | Large | 4.0 | 8 | 8 |
Tab. 2 Compression parameter setting of ResNet
| Layer | PQF (compression ratio 23.01), MFLOPs | QTD (compression ratio 25.38), MFLOPs |
|---|---|---|
| Total | 556.65 | 213.24 |
| conv1 | 1.90 | 1.90 |
| Res1 | 151.52 | 55.05 |
| Res2 | 134.55 | 52.23 |
| Res3 | 134.38 | 52.07 |
| Res4 | 134.30 | 51.99 |
| Linear | 0.01 | 0.01 |
Tab. 3 Comparison of FLOPs in compressed ResNet-18
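As a quick reading aid for Tab. 3, the sketch below derives the FLOPs reduction of QTD relative to PQF, overall and per stage. The numbers are copied from the table; the helper names `flops_ratio` and `per_stage_ratios` are illustrative only.

```python
# MFLOPs per stage, copied from Tab. 3 (compressed ResNet-18).
PQF = {"conv1": 1.90, "Res1": 151.52, "Res2": 134.55,
       "Res3": 134.38, "Res4": 134.30, "Linear": 0.01}
QTD = {"conv1": 1.90, "Res1": 55.05, "Res2": 52.23,
       "Res3": 52.07, "Res4": 51.99, "Linear": 0.01}
PQF_TOTAL, QTD_TOTAL = 556.65, 213.24  # "Total" row of Tab. 3

def flops_ratio(baseline: float, compressed: float) -> float:
    """How many times fewer FLOPs the compressed model needs."""
    return baseline / compressed

def per_stage_ratios(baseline: dict, compressed: dict) -> dict:
    """Stage-wise FLOPs ratio, keyed by the stage names of the baseline."""
    return {stage: baseline[stage] / compressed[stage] for stage in baseline}

if __name__ == "__main__":
    total = flops_ratio(PQF_TOTAL, QTD_TOTAL)
    saved = 100.0 * (1.0 - QTD_TOTAL / PQF_TOTAL)
    print(f"total: {total:.2f}x fewer FLOPs ({saved:.1f}% saved)")
    for stage, ratio in per_stage_ratios(PQF, QTD).items():
        print(f"{stage}: {ratio:.2f}x")
```

The first convolution and the linear layer have identical FLOPs in both columns, so the whole difference comes from the four residual stages.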
| Method | Compression ratio | Original accuracy/% | Compressed accuracy/% | Accuracy drop/percentage points |
|---|---|---|---|---|
| BGD | 25.12 | 69.1 | 67.39 | 1.71 |
| PQF | 25.33 | 69.1 | 67.88 | 1.22 |
| QTD | 24.58 | 69.1 | 65.27 | 3.83 |
| ABC-Net | 32.05 | 69.1 | 62.80 | 6.30 |
| BWN | 32.17 | 69.1 | 60.78 | 8.32 |
| LR-Net | 31.89 | 69.1 | 59.90 | 9.20 |
| BGD | 35.21 | 69.1 | 64.12 | 4.98 |
| PQF | 35.14 | 69.1 | 65.23 | 3.87 |
| QTD | 30.91 | 69.1 | 64.94 | 4.16 |
| BGD | 43.23 | 69.1 | 61.17 | 7.93 |
| PQF | 43.22 | 69.1 | 63.33 | 5.77 |
| QTD | 43.61 | 69.1 | 61.22 | 7.88 |
| PQF | 56.74 | 69.1 | 59.87 | 9.23 |
| QTD | 58.13 | 69.1 | 60.71 | 8.39 |
| PQF | 59.53 | 69.1 | 58.92 | 10.18 |
| QTD | 60.67 | 69.1 | 59.33 | 9.77 |
Tab. 4 Experimental results of ResNet-18 on ImageNet
| Method | Compression ratio | Original accuracy/% | Compressed accuracy/% | Accuracy drop/percentage points |
|---|---|---|---|---|
| CLIP-Q | 14.90 | 76.15 | 73.77 | 2.38 |
| HAQ | 15.20 | 76.15 | 70.63 | 5.52 |
| DC | 15.18 | 76.15 | 68.90 | 7.25 |
| BGD | 15.20 | 76.15 | 74.81 | 1.34 |
| PQF | 16.82 | 76.15 | 75.42 | 0.73 |
| QTD | 16.48 | 76.15 | 75.26 | 0.89 |
| BGD | 25.91 | 76.15 | 71.53 | 4.62 |
| PQF | 25.93 | 76.15 | 73.64 | 2.51 |
| QTD | 23.54 | 76.15 | 71.75 | 4.40 |
| PQF | 29.37 | 76.15 | 70.22 | 5.93 |
| QTD | 30.16 | 76.15 | 70.74 | 5.41 |
| PQF | 32.16 | 76.15 | 69.13 | 7.02 |
| QTD | 33.57 | 76.15 | 69.85 | 6.30 |
Tab. 5 Experimental results of ResNet-50 on ImageNet
| Experiment No. | Method | Compression ratio | Accuracy after compression/% | Accuracy drop/percentage points |
|---|---|---|---|---|
| 1 | Global TKD | 28.69 | 90.24 | 4.85 |
| 2 | VQ | 28.48 | 92.35 | 2.74 |
| 3 | QTD w/o permutation | 29.58 | 93.29 | 1.80 |
| 4 | QTD | 29.58 | 94.02 | 1.07 |
Tab. 6 Ablation study results