Journal of Computer Applications, 2023, Vol. 43, Issue (11): 3411-3417. DOI: 10.11772/j.issn.1001-9081.2022101596
• Artificial Intelligence •
Received: 2022-10-26
Revised: 2023-04-03
Accepted: 2023-04-06
Online: 2023-05-24
Published: 2023-11-10
Contact: Xiao XUE
About author: WEN Kai, born in 1972 in Chongqing, Ph. D., senior engineer. His research interests include mobile communication and computer vision.
Abstract:
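To address the poor classification performance and high computational cost of the capsule network (CapsNet) on complex images containing background noise, an improved capsule network model based on the attention mechanism and weight sharing is proposed: the Shared Transformation Matrix Capsule Network (STM-CapsNet). The model includes the following improvements: 1) an attention module is introduced into the feature extraction layer so that low-level capsules can focus on entity features relevant to the classification task; 2) spatially close low-level capsules are divided into several groups, and the low-level capsules within each group are mapped to high-level capsules through a shared transformation matrix, which reduces computational cost and improves model robustness; 3) an L2 regularization term is added to the margin loss and reconstruction loss to prevent overfitting. Experimental results on the complex image datasets CIFAR10, SVHN (Street View House Number) and FashionMNIST show that each improvement effectively enhances model performance; with 3 routing iterations and 5 shared transformation matrices, STM-CapsNet achieves average accuracies of 85.26%, 93.17% and 94.96%, respectively, with an average parameter size of 8.29 MB, giving better overall performance than the baseline models.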
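As a rough illustration of improvement 2), the Python sketch below counts transformation-matrix weights with and without group sharing. The capsule counts and dimensions (1152 low-level capsules of dimension 8, 10 high-level capsules of dimension 16) are those of the original CapsNet of Sabour et al. and are assumptions here, not the STM-CapsNet configuration. The helper names and the contiguous-index grouping are hypothetical stand-ins for the spatial grouping described in the abstract.

```python
def transform_param_count(num_low, num_high, d_low, d_high, num_groups=None):
    """Count the weights held in capsule transformation matrices.

    num_groups=None: one (d_low x d_high) matrix per (low, high) capsule pair,
    as in the original CapsNet. Otherwise one matrix per (group, high) pair,
    shared by every low-level capsule in that group.
    """
    per_matrix = d_low * d_high
    owners = num_low if num_groups is None else num_groups
    return owners * num_high * per_matrix


def group_of(index, num_low, num_groups):
    """Map a low-level capsule index to a group of contiguous indices.

    Contiguous blocks are only an illustrative proxy for "spatially close";
    the grouping rule used in the paper is not given on this page.
    """
    size = -(-num_low // num_groups)  # ceiling division
    return index // size


# Assumed dimensions (original CapsNet on MNIST), with 5 shared matrices.
NUM_LOW, NUM_HIGH, D_LOW, D_HIGH = 1152, 10, 8, 16
unshared = transform_param_count(NUM_LOW, NUM_HIGH, D_LOW, D_HIGH)
shared = transform_param_count(NUM_LOW, NUM_HIGH, D_LOW, D_HIGH, num_groups=5)
print(unshared, shared)
```

Under these assumed dimensions, sharing cuts the transformation weights from 1474560 to 6400. The 8.29 MB reported in the abstract covers the whole model and is not reproduced by this count.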
Kai WEN, Xiao XUE, Juan JI. Shared transformation matrix capsule network for complex image classification[J]. Journal of Computer Applications, 2023, 43(11): 3411-3417.
r | CIFAR10 | SVHN | FashionMNIST |
---|---|---|---|
1 | 50.23 | 47.87 | 64.76 |
2 | 65.11 | 61.13 | 72.73 |
3 | 83.41 | 90.25 | 87.46 |
4 | 71.14 | 72.52 | 74.94 |
5 | 75.81 | 74.59 | 73.24 |
6 | 73.47 | 65.45 | 60.26 |
7 | 68.76 | 62.83 | 53.27 |
8 | 65.48 | 67.84 | 54.53 |
9 | 69.63 | 54.91 | 57.17 |
10 | 62.72 | 57.11 | 49.23 |
Tab. 1 Accuracy comparison under different numbers of routing iterations (%)
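The variable r in Tab. 1 is the number of routing iterations. A minimal sketch of routing-by-agreement as described by Sabour et al. is given below in plain Python, assuming the prediction vectors `u_hat[i][j]` (from low-level capsule i to high-level capsule j) have already been computed. The function names are illustrative, and this is not the authors' implementation.

```python
import math


def squash(s):
    """Squash nonlinearity: keeps the direction, maps the length into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, r):
    """Run r routing iterations and return the high-level capsule outputs."""
    num_low, num_high = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    b = [[0.0] * num_high for _ in range(num_low)]  # routing logits
    v = []
    for _ in range(r):
        # Coupling coefficients: softmax over high-level capsules per low capsule.
        c = [softmax(row) for row in b]
        v = []
        for j in range(num_high):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(num_low))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the output (dot product).
        for i in range(num_low):
            for j in range(num_high):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v
```

Each extra iteration adds another pass over all low-to-high capsule pairs, so r trades computation against how sharply the coupling coefficients concentrate.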
Dataset | e-squash | L2 regularization term | CV-CapsNet training set | CV-CapsNet test set | DA-CapsNet training set | DA-CapsNet test set |
---|---|---|---|---|---|---|
CIFAR10 | 87.29 | 89.31 | ||||
√ | 88.13 | 83.94 | 90.01 | 85.72 | ||
√ | 88.59 | 84.76 | 87.49 | 86.17 | ||
√ | √ | 89.26 | 85.48 | 89.21 | 88.20 | |
SVHN | 91.35 | 90.25 | 93.42 | 94.16 | ||
√ | 90.95 | 93.87 | 93.25 | |||
√ | 92.05 | 91.35 | 94.17 | |||
√ | √ | 92.39 | 90.83 | 94.68 | 94.24 | |
FashionMNIST | 92.45 | 94.12 | 93.98 | |||
√ | 92.57 | 90.14 | 94.37 | 92.19 | ||
√ | 93.04 | 89.43 | 92.14 | |||
√ | √ | 92.68 | 91.81 | 94.70 | 94.12 |
Tab. 2 Accuracy comparison of ablation experiments (%)
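The L2 regularization term ablated in Tab. 2 is added to the margin loss and reconstruction loss. A hedged sketch of such a combined objective follows. The margin constants (0.9, 0.1, 0.5) and the reconstruction weight 0.0005 are the values used by Sabour et al.; `l2_coeff` and all function names are assumptions, since this page does not state the paper's settings.

```python
def margin_loss(lengths, target, m_pos=0.9, m_neg=0.1, lam=0.5):
    """Margin loss over class-capsule lengths for one sample.

    lengths: output vector length of each class capsule.
    target:  index of the true class.
    """
    total = 0.0
    for k, length in enumerate(lengths):
        if k == target:
            # Penalize the true class capsule for being shorter than m_pos.
            total += max(0.0, m_pos - length) ** 2
        else:
            # Penalize other capsules for being longer than m_neg, down-weighted.
            total += lam * max(0.0, length - m_neg) ** 2
    return total


def l2_penalty(weights, coeff):
    """L2 regularization: coeff times the sum of squared weights."""
    return coeff * sum(w * w for w in weights)


def total_loss(lengths, target, recon_error, weights,
               recon_weight=0.0005, l2_coeff=1e-4):
    """Margin loss + scaled reconstruction error + L2 penalty (assumed weights)."""
    return (margin_loss(lengths, target)
            + recon_weight * recon_error
            + l2_penalty(weights, l2_coeff))
```

The penalty grows with the squared magnitude of the weights, which is the usual mechanism by which an L2 term discourages overfitting.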
N | Accuracy | Parameter size/MB | N | Accuracy | Parameter size/MB |
---|---|---|---|---|---|
1 | 0.513 | 4.37 | 6 | 0.819 | 17.26 |
2 | 0.716 | 7.26 | 7 | 0.851 | 19.81 |
3 | 0.812 | 10.83 | 8 | 0.845 | 22.68 |
4 | 0.824 | 13.57 | 9 | 0.855 | 26.14 |
5 | 0.841 | 16.35 | 10 | 0.860 | 31.65 |
Tab. 3 Effect of N value on model performance
Model | CIFAR10 accuracy/% | FashionMNIST accuracy/% | SVHN accuracy/% | Average parameter size/MB |
---|---|---|---|---|
DA-CapsNet[9] | 85.47 | 91.98 | 94.82 | 11.75 |
CV-CapsNet++[11] | 84.21 | 92.79 | 91.35 | 20.69 |
Quick-CapsNet[14] | 67.18 | 88.84 | 86.20 | 9.81 |
DenseCapsNet[22] | 86.17 | 84.39 | 93.18 | 1.64 |
HitNet[23] | 73.30 | — | — | — |
MLcn2[24] | 75.18 | 92.63 | 89.78 | 10.63 |
MS-CapsNet[25] | 75.70 | 92.70 | 90.34 | 10.89 |
STM-CapsNet | 85.26 | 93.17 | 94.96 | 8.29 |
Tab. 4 Comparison of classification accuracy and average parameter size among different capsule networks
References:
[1] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90. DOI: 10.1145/3065386
[2] FAN Y, LI Y, WANG S, et al. Application of YOLOv5 neural network based on improved attention mechanism in recognition of Thangka image defects[J]. KSII Transactions on Internet and Information Systems, 2022, 16(1): 245-265. DOI: 10.3837/tiis.2022.01.014
[3] CAI J, LI J, LI W, et al. Deep learning model used in text classification[C]// Proceedings of the 2018 15th International Computer Conference on Wavelet Active Media Technology and Information Processing. Piscataway: IEEE, 2018: 123-126. DOI: 10.1109/iccwamtip.2018.8632592
[4] RATNER A J, EHRENBERG H R, HUSSAIN Z, et al. Learning to compose domain-specific transformations for data augmentation[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 3239-3249.
[5] SABOUR S, FROSST N, HINTON G E. Dynamic routing between capsules[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 3859-3869.
[6] AFSHAR P, MOHAMMADI A, PLATANIOTIS K N. Brain tumor type classification via capsule networks[C]// Proceedings of the 2018 25th IEEE International Conference on Image Processing. Piscataway: IEEE, 2018: 3124-3128. DOI: 10.1109/icip.2018.8451379
[7] MUKHOMETZIANOV R, CARRILLO J. CapsNet comparative performance evaluation for image classification[EB/OL]. [2022-10-15].
[8] WANG K, HE R, WANG S, et al. The efficient-CapsNet model for facial expression recognition[J]. Applied Intelligence, 2023, 53: 16367-16380. DOI: 10.1007/s10489-022-04349-8
[9] HUANG W, ZHOU F. DA-CapsNet: dual attention mechanism capsule network[J]. Scientific Reports, 2020, 10(1): Article No. 11383. DOI: 10.1038/s41598-020-68453-w
[10] JIA X, LI J, ZHAO B, et al. Res-CapsNet: residual capsule network for data classification[J]. Neural Processing Letters, 2022, 54: 4229-4245. DOI: 10.1007/s11063-022-10806-9
[11] CHENG X, HE J, HEA J, et al. Cv-CapsNet: complex-valued capsule network[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(2): 829-839.
[12] MOBINY A, VAN NGUYEN H. Fast CapsNet for lung cancer screening[C]// Proceedings of the 2018 21st International Conference on Medical Image Computing and Computer Assisted Intervention, LNIP 11071. Cham: Springer, 2018: 706-714.
[13] RAJASEGARAN J, JAYASUNDARA V, JAYASEKARA S, et al. DeepCaps: going deeper with capsule networks[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2019: 10728-10737. DOI: 10.1109/cvpr.2019.01098
[14] SHIRI P, SHARIFI R, BANIASADI A. Quick-CapsNet (QCN): a fast alternative to capsule networks[C]// Proceedings of the 2020 IEEE/ACS 17th International Conference on Computer Systems and Applications. Piscataway: IEEE, 2020: 1-8. DOI: 10.1109/aiccsa50499.2020.9316525
[15] LI X, WANG L. β-CapsNet: learning disentangled representation for CapsNet by information bottleneck[J]. Neural Computing and Applications, 2022, 33(1): 1-13.
[16] YIN C Y, HE M. Text classification based on improved capsule network[J]. Journal of Computer Applications, 2020, 40(9): 2525-2530. DOI: 10.11772/j.issn.1001-9081.2019122153
[17] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010.
[18] REN X L, LI X Q, YAN Y H, et al. Study on attention mechanism and its role in medical visual task[J]. Image Technology, 2023, 35(1): 76-80. DOI: 10.3969/j.issn.1001-0270.2023.01.14
[19] KRIZHEVSKY A, HINTON G E. Learning multiple layers of features from tiny images[J]. Handbook of Systemic Autoimmune Diseases, 2009, 1(4): 1201-1208.
[20] XIAO H, RASUL K, VOLLGRAF R, et al. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms[EB/OL]. [2022-10-15].
[21] WEI X, YANG F, WU C. Deep residual networks of residual networks for image super-resolution[C]// Proceedings of the 2017 LIDAR Imaging Detection & Target Recognition, SPIE 10605. Bellingham, WA: SPIE, 2017: 1132-1140.
[22] SUN K, WEN X, YUAN L, et al. Dense capsule networks with fewer parameters[J]. Soft Computing, 2021, 25(10): 6927-6945. DOI: 10.1007/s00500-021-05774-6
[23] DELIÈGE A, CIOPPA A, DROOGENBROECK M V. HitNet: a neural network with capsules embedded in a Hit-or-Miss layer, extended with hybrid data augmentation and ghost capsules[EB/OL]. [2022-10-15]. DOI: 10.48550/arXiv.1806.06519
[24] ROSARIO V M D, BORIN E, BRETERNITZ M, Jr. The Multi-Lane Capsule Network (MLCN)[J]. IEEE Signal Processing Letters, 2019, 26(7): 1006-1010. DOI: 10.1109/lsp.2019.2915661
[25] XIANG C, LU Z, ZOU W, et al. MS-CapsNet: a novel multi-scale capsule network[J]. IEEE Signal Processing Letters, 2018, 25(12): 1850-1854. DOI: 10.1109/lsp.2018.2873892
[26] ZHOU D, KANG B, JIN X, et al. DeepViT: towards deeper vision transformer[EB/OL]. [2022-10-15].