Journal of Computer Applications, 2023, Vol. 43, Issue (11): 3411-3417. DOI: 10.11772/j.issn.1001-9081.2022101596
Special topic: Artificial intelligence
Received: 2022-10-26
Revised: 2023-04-03
Accepted: 2023-04-06
Online: 2023-05-24
Published: 2023-11-10
Contact: Xiao XUE
About author: WEN Kai, born in 1972 in Chongqing, Ph. D., senior engineer. His research interests include mobile communication and computer vision.
Abstract:
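To address the poor classification performance and high computational cost of the Capsule Network (CapsNet) on complex images containing background noise, an improved capsule network based on an attention mechanism and weight sharing is proposed: the Shared Transformation Matrix Capsule Network (STM-CapsNet). The model makes three main improvements: 1) an attention module is introduced into the feature extraction layer so that low-level capsules focus on entity features relevant to the classification task; 2) spatially adjacent low-level capsules are divided into several groups, and the low-level capsules within each group are mapped to high-level capsules through a shared transformation matrix, which reduces computational cost and improves model robustness; 3) an L2 regularization term is added to the margin loss and reconstruction loss to prevent overfitting. Experimental results on the complex image datasets CIFAR10, SVHN (Street View House Number) and FashionMNIST show that each improvement effectively enhances model performance. With 3 routing iterations and 5 shared transformation matrices, STM-CapsNet achieves average accuracies of 85.26% on CIFAR10, 93.17% on FashionMNIST and 94.96% on SVHN (Tab. 4), with an average parameter size of 8.29 MB, giving better overall performance than the baseline models.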
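The grouping idea in improvement 2) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the grouping rule (contiguous row bands of the capsule grid), the function names and all sizes below are assumptions chosen for illustration only. Each group `n` holds one transformation matrix per high-level capsule `j`, and every low-level capsule in that group reuses it to form its prediction vectors.

```python
import numpy as np

def group_ids_by_rows(height, width, n_groups):
    """Assign each grid position to one of n_groups contiguous row bands.

    Assumed grouping rule for illustration; requires n_groups <= height.
    """
    row_group = np.arange(height) * n_groups // height  # (height,)
    return np.repeat(row_group, width)                   # (height * width,)

def shared_predictions(u, group_ids, W):
    """Map low-level capsules to prediction vectors with group-shared matrices.

    u:         (I, d_in)             low-level capsule outputs
    group_ids: (I,)                  group index of each low-level capsule
    W:         (N, J, d_out, d_in)   one matrix per (group, high-level capsule)
    returns    (I, J, d_out)         prediction vectors u_hat[i, j]
    """
    W_per_capsule = W[group_ids]                         # (I, J, d_out, d_in)
    return np.einsum('ijkd,id->ijk', W_per_capsule, u)

def transform_param_count(n_matrix_sets, n_out, d_out, d_in):
    """Number of weights held in the transformation matrices."""
    return n_matrix_sets * n_out * d_out * d_in

# Hypothetical sizes: 6x6 capsule grid, 8-D inputs, 10 classes, 16-D outputs.
rng = np.random.default_rng(0)
gids = group_ids_by_rows(6, 6, n_groups=5)
W = rng.normal(size=(5, 10, 16, 8))
u = rng.normal(size=(36, 8))
u_hat = shared_predictions(u, gids, W)                   # shape (36, 10, 16)
```

With these hypothetical sizes, sharing 5 matrix sets needs 5 × 10 × 16 × 8 = 6400 weights, whereas one matrix set per low-level capsule would need 36 × 10 × 16 × 8 = 46080, which is the kind of saving the grouping is meant to provide.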
Kai WEN, Xiao XUE, Juan JI. Shared transformation matrix capsule network for complex image classification[J]. Journal of Computer Applications, 2023, 43(11): 3411-3417.
r | CIFAR10 | SVHN | FashionMNIST |
---|---|---|---|
1 | 50.23 | 47.87 | 64.76 |
2 | 65.11 | 61.13 | 72.73 |
3 | 83.41 | 90.25 | 87.46 |
4 | 71.14 | 72.52 | 74.94 |
5 | 75.81 | 74.59 | 73.24 |
6 | 73.47 | 65.45 | 60.26 |
7 | 68.76 | 62.83 | 53.27 |
8 | 65.48 | 67.84 | 54.53 |
9 | 69.63 | 54.91 | 57.17 |
10 | 62.72 | 57.11 | 49.23 |
Tab. 1 Accuracy comparison under different numbers of routing iterations r (%)
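For orientation on what the routing count r in Tab. 1 controls, the following is a minimal NumPy sketch of routing-by-agreement as described by Sabour et al. in the dynamic routing paper listed in the references. It is an illustration under assumptions, not the authors' code: it uses the standard squash function rather than the e-squash variant listed in Tab. 2, and the function names and shapes are hypothetical.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Standard squash: keeps direction, maps vector length into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_routing(u_hat, r=3):
    """Route prediction vectors to high-level capsules with r iterations.

    u_hat:   (I, J, d_out) prediction vectors from I low-level capsules
    r:       number of routing iterations, r >= 1
    returns  (J, d_out)    high-level capsule outputs
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                    # routing logits
    for it in range(r):
        c = softmax(b, axis=1)                     # coupling coefficients
        s = np.einsum('ij,ijk->jk', c, u_hat)      # weighted sum per output
        v = squash(s)
        if it < r - 1:
            # Agreement between predictions and outputs updates the logits.
            b = b + np.einsum('ijk,jk->ij', u_hat, v)
    return v
```

A call such as `dynamic_routing(u_hat, r=3)` corresponds to the r = 3 row of the table; larger r only repeats the agreement update more times.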
Dataset | e-squash | L2 regularization term | CV-CapsNet training set | CV-CapsNet test set | DA-CapsNet training set | DA-CapsNet test set |
---|---|---|---|---|---|---|
CIFAR10 | 87.29 | 89.31 | ||||
√ | 88.13 | 83.94 | 90.01 | 85.72 | ||
√ | 88.59 | 84.76 | 87.49 | 86.17 | ||
√ | √ | 89.26 | 85.48 | 89.21 | 88.20 | |
SVHN | 91.35 | 90.25 | 93.42 | 94.16 | ||
√ | 90.95 | 93.87 | 93.25 | |||
√ | 92.05 | 91.35 | 94.17 | |||
√ | √ | 92.39 | 90.83 | 94.68 | 94.24 | |
FashionMNIST | 92.45 | 94.12 | 93.98 | |||
√ | 92.57 | 90.14 | 94.37 | 92.19 | ||
√ | 93.04 | 89.43 | 92.14 | |||
√ | √ | 92.68 | 91.81 | 94.70 | 94.12 |
Tab. 2 Comparison of accuracy in ablation experiments (%)
N | Accuracy | Parameters/MB | N | Accuracy | Parameters/MB |
---|---|---|---|---|---|
1 | 0.513 | 4.37 | 6 | 0.819 | 17.26 |
2 | 0.716 | 7.26 | 7 | 0.851 | 19.81 |
3 | 0.812 | 10.83 | 8 | 0.845 | 22.68 |
4 | 0.824 | 13.57 | 9 | 0.855 | 26.14 |
5 | 0.841 | 16.35 | 10 | 0.860 | 31.65 |
Tab. 3 Effect of N value on model performance
Model | CIFAR10 accuracy/% | FashionMNIST accuracy/% | SVHN accuracy/% | Average parameters/MB |
---|---|---|---|---|
DA-CapsNet | 85.47 | 91.98 | 94.82 | 11.75 |
CV-CapsNet++ | 84.21 | 92.79 | 91.35 | 20.69 |
Quick-CapsNet | 67.18 | 88.84 | 86.20 | 9.81 |
DenseCapsNet | 86.17 | 84.39 | 93.18 | 1.64 |
HitNet | 73.30 | — | — | — |
MLcn2 | 75.18 | 92.63 | 89.78 | 10.63 |
MS-CapsNet | 75.70 | 92.70 | 90.34 | 10.89 |
STM-CapsNet | 85.26 | 93.17 | 94.96 | 8.29 |
Tab. 4 Comparison of classification accuracy and average parameter size among different capsule networks
References:
[1] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90. DOI: 10.1145/3065386
[2] FAN Y, LI Y, WANG S, et al. Application of YOLOv5 neural network based on improved attention mechanism in recognition of Thangka image defects[J]. KSII Transactions on Internet and Information Systems, 2022, 16(1): 245-265. DOI: 10.3837/tiis.2022.01.014
[3] CAI J, LI J, LI W, et al. Deep learning model used in text classification[C]// Proceedings of the 2018 15th International Computer Conference on Wavelet Active Media Technology and Information Processing. Piscataway: IEEE, 2018: 123-126. DOI: 10.1109/iccwamtip.2018.8632592
[4] RATNER A J, EHRENBERG H R, HUSSAIN Z, et al. Learning to compose domain-specific transformations for data augmentation[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 3239-3249.
[5] SABOUR S, FROSST N, HINTON G E. Dynamic routing between capsules[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 3859-3869.
[6] AFSHAR P, MOHAMMADI A, PLATANIOTIS K N. Brain tumor type classification via capsule networks[C]// Proceedings of the 2018 25th IEEE International Conference on Image Processing. Piscataway: IEEE, 2018: 3124-3128. DOI: 10.1109/icip.2018.8451379
[7] MUKHOMETZIANOV R, CARRILLO J. CapsNet comparative performance evaluation for image classification [EB/OL]. [2022-10-15].
[8] WANG K, HE R, WANG S, et al. The efficient-CapsNet model for facial expression recognition[J]. Applied Intelligence, 2023, 53: 16367-16380. DOI: 10.1007/s10489-022-04349-8
[9] HUANG W, ZHOU F. DA-CapsNet: dual attention mechanism capsule network[J]. Scientific Reports, 2020, 10(1): Article No. 11383. DOI: 10.1038/s41598-020-68453-w
[10] JIA X, LI J, ZHAO B, et al. Res-CapsNet: residual capsule network for data classification[J]. Neural Processing Letters, 2022, 54: 4229-4245. DOI: 10.1007/s11063-022-10806-9
[11] CHENG X, HE J, HE J, et al. Cv-CapsNet: complex-valued capsule network[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(2): 829-839.
[12] MOBINY A, VAN NGUYEN H. Fast CapsNet for lung cancer screening[C]// Proceedings of the 2018 21st International Conference on Medical Image Computing and Computer Assisted Intervention, LNIP 11071. Cham: Springer, 2018: 706-714.
[13] RAJASEGARAN J, JAYASUNDARA V, JAYASEKARA S, et al. DeepCaps: going deeper with capsule networks[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2019: 10728-10737. DOI: 10.1109/cvpr.2019.01098
[14] SHIRI P, SHARIFI R, BANIASADI A. Quick-CapsNet (QCN): a fast alternative to capsule networks[C]// Proceedings of the 2020 IEEE/ACS 17th International Conference on Computer Systems and Applications. Piscataway: IEEE, 2020: 1-8. DOI: 10.1109/aiccsa50499.2020.9316525
[15] LI X, WANG L. β-CapsNet: learning disentangled representation for CapsNet by information bottleneck[J]. Neural Computing and Applications, 2022, 33(1): 1-13.
[16] YIN C Y, HE M. Text classification based on improved capsule network[J]. Journal of Computer Applications, 2020, 40(9): 2525-2530. DOI: 10.11772/j.issn.1001-9081.2019122153
[17] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010.
[18] REN X L, LI X Q, YAN Y H, et al. Study on attention mechanism and its role in medical visual task[J]. Image Technology, 2023, 35(1): 76-80. DOI: 10.3969/j.issn.1001-0270.2023.01.14
[19] KRIZHEVSKY A, HINTON G E. Learning multiple layers of features from tiny images[J]. Handbook of Systemic Autoimmune Diseases, 2009, 1(4): 1201-1208.
[20] XIAO H, RASUL K, VOLLGRAF R, et al. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms [EB/OL]. [2022-10-15].
[21] WEI X, YANG F, WU C. Deep residual networks of residual networks for image super-resolution[C]// Proceedings of the 2017 LIDAR Imaging Detection & Target Recognition, SPIE 10605. Bellingham, WA: SPIE, 2017: 1132-1140.
[22] SUN K, WEN X, YUAN L, et al. Dense capsule networks with fewer parameters[J]. Soft Computing, 2021, 25(10): 6927-6945. DOI: 10.1007/s00500-021-05774-6
[23] DELIÈGE A, CIOPPA A, DROOGENBROECK M V. HitNet: a neural network with capsules embedded in a Hit-or-Miss layer, extended with hybrid data augmentation and ghost capsules [EB/OL]. [2022-10-15]. DOI: 10.48550/arXiv.1806.06519
[24] ROSARIO V M D, BORIN E, BRETERNITZ M, Jr. The Multi-Lane Capsule Network (MLCN)[J]. IEEE Signal Processing Letters, 2019, 26(7): 1006-1010. DOI: 10.1109/lsp.2019.2915661
[25] XIANG C, LU Z, ZOU W, et al. MS-CapsNet: a novel multi-scale capsule network[J]. IEEE Signal Processing Letters, 2018, 25(12): 1850-1854. DOI: 10.1109/lsp.2018.2873892
[26] ZHOU D, KANG B, JIN X, et al. DeepViT: towards deeper vision transformer [EB/OL]. [2022-10-15].