Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (11): 3411-3417. DOI: 10.11772/j.issn.1001-9081.2022101596

• Artificial Intelligence •


Shared transformation matrix capsule network for complex image classification

Kai WEN, Xiao XUE, Juan JI

  1. College of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 401520, China
  • Received: 2022-10-26  Revised: 2023-04-03  Accepted: 2023-04-06  Online: 2023-05-24  Published: 2023-11-10
  • Contact: Xiao XUE (1464090345@qq.com)
  • About the authors: WEN Kai, born in 1972, a native of Chongqing, Ph. D., senior engineer. His research interests include mobile communication and computer vision.
    XUE Xiao, born in 1996, a native of Yuncheng, Shanxi, M. S. candidate. His research interests include image classification and object detection.
    JI Juan, born in 1998, a native of Guang'an, Sichuan, M. S. candidate. Her research interests include image denoising and image segmentation.


Abstract:

Concerning the problems of poor classification performance and high computational overhead of Capsule Network (CapsNet) on complex images with background noise, an improved capsule network model based on an attention mechanism and weight sharing, called Shared Transformation Matrix CapsNet (STM-CapsNet), was proposed. The model includes the following improvements: 1) an attention module was introduced into the feature extraction layer of CapsNet, enabling low-level capsules to focus on entity features relevant to the classification task; 2) low-level capsules with close spatial positions were divided into several groups, and each group was mapped to the high-level capsules through a shared transformation matrix, which reduced computational overhead and improved model robustness; 3) an L2 regularization term was added to the margin loss and reconstruction loss to prevent the model from overfitting. Experimental results on three complex image datasets, CIFAR10, SVHN (Street View House Number) and FashionMNIST, show that each of the above improvements effectively enhances model performance; when the number of routing iterations is 3 and the number of shared transformation matrices is 5, the average accuracies of STM-CapsNet are 85.26%, 93.17% and 94.96% respectively, with an average parameter size of 8.29 MB, verifying that STM-CapsNet achieves better overall performance than the baseline models.
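
The abstract describes improvements 1) and 2) only at a high level, so the following PyTorch sketch is illustrative rather than the paper's implementation: the squeeze-and-excitation-style attention gate, the capsule dimensions, and the default of 5 groups (matching the reported number of shared transformation matrices) are all assumptions.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Attention gate for the feature extraction layer. A squeeze-and-
    excitation style design is assumed here; the paper's exact module
    may differ."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                       # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))         # global average pool -> per-channel gate
        return x * w.view(x.size(0), -1, 1, 1)  # reweight the feature maps

class SharedTransformCapsuleLayer(nn.Module):
    """Predicts high-level capsules from low-level ones. Spatially close
    low-level capsules are split into num_groups groups, and all capsules
    in a group share one transformation matrix per output capsule, instead
    of one matrix per input capsule as in the original CapsNet."""
    def __init__(self, num_in, in_dim, num_out, out_dim, num_groups=5):
        super().__init__()
        assert num_in % num_groups == 0, "num_groups must divide num_in"
        self.num_groups, self.group_size = num_groups, num_in // num_groups
        self.num_out = num_out
        # One matrix per (group, output capsule) pair: num_groups * num_out
        # matrices instead of num_in * num_out.
        self.W = nn.Parameter(0.01 * torch.randn(num_groups, num_out, out_dim, in_dim))

    def forward(self, u):                       # u: (B, num_in, in_dim)
        B = u.size(0)
        u = u.reshape(B, self.num_groups, self.group_size, 1, u.size(-1), 1)
        u_hat = (self.W.unsqueeze(1) @ u).squeeze(-1)  # shared W broadcast over each group
        return u_hat.reshape(B, -1, self.num_out, u_hat.size(-1))  # (B, num_in, num_out, out_dim)

In this sketch the number of transformation matrices drops by a factor of the group size, which is one plausible source of the parameter savings the abstract reports.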

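Improvement 3) attaches an L2 penalty to CapsNet's usual margin-plus-reconstruction objective. The sketch below pairs standard routing-by-agreement (3 iterations, as in the reported setting) with such a loss; the margin constants (0.9, 0.1, 0.5) follow the original CapsNet paper, and both lambda weights are assumed values, not taken from this article.

import torch
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    # Standard CapsNet squashing: short vectors shrink toward 0,
    # long vectors approach unit length.
    n2 = (s * s).sum(dim, keepdim=True)
    return (n2 / (1.0 + n2)) * s / torch.sqrt(n2 + eps)

def dynamic_routing(u_hat, num_iters=3):
    # u_hat: (B, num_in, num_out, out_dim) predictions from the layer below.
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)  # routing logits
    for _ in range(num_iters):
        c = F.softmax(b, dim=-1).unsqueeze(-1)              # coupling coefficients
        v = squash((c * u_hat).sum(dim=1))                  # (B, num_out, out_dim)
        b = b + (u_hat * v.unsqueeze(1)).sum(-1)            # agreement update
    return v

def stm_capsnet_loss(v, labels, recon, images, model,
                     lam_recon=0.0005, lam_l2=1e-4):
    # Margin loss + scaled reconstruction loss, plus the L2 term from
    # improvement 3). lam_recon and lam_l2 are assumed hyperparameters.
    lengths = v.norm(dim=-1)                                # (B, num_classes)
    t = F.one_hot(labels, lengths.size(1)).float()
    margin = (t * F.relu(0.9 - lengths) ** 2
              + 0.5 * (1.0 - t) * F.relu(lengths - 0.1) ** 2).sum(dim=1).mean()
    recon_loss = F.mse_loss(recon, images.flatten(1), reduction="sum") / images.size(0)
    l2 = sum((p * p).sum() for p in model.parameters())
    return margin + lam_recon * recon_loss + lam_l2 * l2
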
Key words: Capsule Network (CapsNet), image classification, attention mechanism, shared transformation matrix, deep learning

CLC number: