Journal of Computer Applications
ZHANG Yixin, JIANG Lin, LI Yuancheng, JI Chen
Abstract: To address the memory access overhead, redundant computation, and constrained deployment caused by the large parameter scale of Convolutional Neural Networks (CNN), a joint CNN compression method for reconfigurable structures was proposed. Network structural characteristics and hardware deployment requirements were combined so that pruning and quantization were optimized jointly. First, a convolutional-layer pruning strategy based on feature similarity was proposed, in which feature information evaluation, cluster grouping, similarity computation, and redundancy screening were performed in sequence to filter out low-contribution and redundant filters, while progressive threshold pruning was applied to compress redundant weights in fully connected layers. Second, for quantization, layer sensitivity indices were constructed from Hessian traces, and per-layer precision was allocated adaptively under a bit-width budget. Finally, an optimized deployment scheme was designed around the characteristics of reconfigurable structures. Experimental results on the CIFAR-10 dataset showed that the proposed method achieved compression ratios of 16.2× for VGG16 and 8.38× for ResNet18, exceeding the 13.9× ratio of APQ. Compared with a pruning baseline using fixed 16-bit precision, the proposed deployment scheme reduced the inference latency of the pruned VGG16 on a self-reconfigurable and self-evolvable AI chip from 23.3 ms to 9.1 ms, a 2.56× speedup. The proposed method reduces storage and transmission overhead while maintaining classification accuracy, and improves deployment efficiency and computational performance on edge devices.
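The feature-similarity pruning step summarized in the abstract (similarity measurement followed by redundancy screening) can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the function name `prune_similar_filters`, the use of cosine similarity on flattened filter weights, and the greedy keep-first grouping rule are all assumptions for exposition.

```python
import numpy as np

def prune_similar_filters(weights, sim_threshold=0.95):
    """Sketch of similarity-based filter pruning.

    weights: array of shape (num_filters, in_channels, kh, kw).
    Flattens each filter, computes pairwise cosine similarity, and
    keeps only one representative from each group of near-duplicate
    filters. Returns the indices of filters to keep.
    """
    n = weights.shape[0]
    flat = weights.reshape(n, -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)   # avoid division by zero
    sim = unit @ unit.T                      # pairwise cosine similarity
    keep, removed = [], set()
    for i in range(n):
        if i in removed:
            continue
        keep.append(i)                       # i represents its group
        for j in range(i + 1, n):
            if sim[i, j] >= sim_threshold:   # j is redundant w.r.t. i
                removed.add(j)
    return keep
```

In a real pipeline, the kept indices would be used to slice the convolutional layer's weight tensor (and the corresponding input channels of the next layer) before fine-tuning.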
Key words: reconfigurable structure, Convolutional Neural Network (CNN), model compression, structured pruning, adaptive quantization, artificial intelligence chip
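The adaptive precision allocation described in the abstract (a layer sensitivity index under a bit-width budget) could be approximated by a greedy scheme over precomputed per-layer sensitivity scores, such as Hessian traces. This is a hedged sketch: the function name `allocate_bits`, the bit-width choices (4, 8, 16), and the greedy promotion rule are illustrative assumptions, not the method reported in the paper.

```python
def allocate_bits(sensitivity, budget_bits, choices=(4, 8, 16)):
    """Sketch of sensitivity-aware bit-width allocation.

    sensitivity: per-layer sensitivity scores (e.g. Hessian traces),
    higher means the layer needs more precision.
    budget_bits: total bit-width budget summed over all layers.
    Starts every layer at the lowest bit-width, then promotes layers
    in order of decreasing sensitivity while the budget allows.
    """
    n = len(sensitivity)
    bits = [min(choices)] * n
    used = sum(bits)
    for idx in sorted(range(n), key=lambda i: -sensitivity[i]):
        # try the highest bit-width first, fall back to lower ones
        for b in sorted(choices, reverse=True):
            if b > bits[idx] and used + (b - bits[idx]) <= budget_bits:
                used += b - bits[idx]
                bits[idx] = b
                break
    return bits
```

For example, with sensitivities [10.0, 1.0, 5.0] and a 28-bit budget, the most sensitive layer is promoted to 16 bits, the next to 8, and the least sensitive stays at 4.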
CLC Number: TN409; TP183
ZHANG Yixin, JIANG Lin, LI Yuancheng, JI Chen. CNN pruning and quantization compression method for reconfigurable structures [J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2025081055.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2025081055