Journal of Computer Applications, 2025, Vol. 45, Issue (6): 2016-2024. DOI: 10.11772/j.issn.1001-9081.2024060806
Multimedia computing and computer simulation
Xiang WANG1, Qianqian CUI1, Xiaoming ZHANG1, Jianchao WANG1, Zhenzhou WANG1, Jialin SONG2
Received: 2024-06-20
Revised: 2024-08-28
Accepted: 2024-09-03
Online: 2024-09-10
Published: 2025-06-10
Contact: Jianchao WANG
About author: WANG Xiang, born in 1978 in Shijiazhuang, Hebei, Ph. D., associate professor. Her research interests include intelligent optimization algorithms and machine vision.
Xiang WANG, Qianqian CUI, Xiaoming ZHANG, Jianchao WANG, Zhenzhou WANG, Jialin SONG. Wireless capsule endoscopy image classification model based on improved ConvNeXt[J]. Journal of Computer Applications, 2025, 45(6): 2016-2024.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024060806
Class | Original training set | Original validation set | Original test set | Training set after augmentation |
---|---|---|---|---|
Total | 2 344 | 780 | 777 | 5 279 |
Pylorus | 321 | 107 | 106 | 642 |
Ileocecal valve | 312 | 104 | 104 | 624 |
Angiectasia | 304 | 101 | 101 | 608 |
Erythema | 96 | 32 | 31 | 576 |
Lymphangiectasia | 261 | 87 | 87 | 588 |
Polyp | 33 | 11 | 11 | 396 |
Ulcer | 273 | 91 | 90 | 602 |
Erosion | 304 | 101 | 101 | 608 |
Normal | 440 | 146 | 146 | 635 |
Tab. 1 Dataset distribution
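To make the class imbalance in Tab. 1 easier to read, the following minimal sketch recomputes the column totals, the train/validation/test proportions, and the per-class growth after augmentation. The counts are copied from Tab. 1; the names `COUNTS`, `column_totals`, `split_fractions`, `augmentation_factors`, and `imbalance_ratio` are illustrative assumptions, and the code does not reproduce the augmentation procedure itself.

```python
# Per-class image counts copied from Tab. 1:
# (training, validation, test, training after augmentation).
# All names below are hypothetical helpers, not part of the paper.
COUNTS = {
    "Pylorus": (321, 107, 106, 642),
    "Ileocecal valve": (312, 104, 104, 624),
    "Angiectasia": (304, 101, 101, 608),
    "Erythema": (96, 32, 31, 576),
    "Lymphangiectasia": (261, 87, 87, 588),
    "Polyp": (33, 11, 11, 396),
    "Ulcer": (273, 91, 90, 602),
    "Erosion": (304, 101, 101, 608),
    "Normal": (440, 146, 146, 635),
}


def column_totals(counts):
    """Sum each of the four columns over all classes."""
    return tuple(sum(row[i] for row in counts.values()) for i in range(4))


def split_fractions(counts):
    """Fraction of the original images in the training, validation and test sets."""
    train, val, test, _ = column_totals(counts)
    n = train + val + test
    return tuple(round(x / n, 3) for x in (train, val, test))


def augmentation_factors(counts):
    """Ratio of augmented to original training images for each class."""
    return {name: round(row[3] / row[0], 2) for name, row in counts.items()}


def imbalance_ratio(counts, column):
    """Largest class size divided by smallest class size in one column."""
    sizes = [row[column] for row in counts.values()]
    return round(max(sizes) / min(sizes), 2)


if __name__ == "__main__":
    print("totals:", column_totals(COUNTS))
    print("split fractions:", split_fractions(COUNTS))
    print("augmentation factors:", augmentation_factors(COUNTS))
    print("imbalance before/after:", imbalance_ratio(COUNTS, 0), imbalance_ratio(COUNTS, 3))
```

With these counts the split is roughly 6:2:2, the smallest class (Polyp) grows twelvefold in the training set, and the ratio between the largest and smallest training class drops from about 13.3 to about 1.6.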
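Experiment type | Module/Model | Accuracy/% | Mean precision/% | Mean recall/% | Mean F1/% | Parameters/10^6 |
---|---|---|---|---|---|---|
Attention mechanism comparison | CBAM | 92.54 | 92.41 | 93.82 | 93.05 | 28.35 |
Attention mechanism comparison | ECA | 93.05 | 94.11 | 94.22 | 94.11 | 28.25 |
Attention mechanism comparison | CA | 93.95 | 94.35 | 94.65 | 94.47 | 28.40 |
Attention mechanism comparison | SimAM | 94.85 | 94.70 | 95.46 | 95.02 | 28.25 |
Comparison with other models | VGG16 | 89.35 | 89.84 | 89.43 | 89.55 | 134.30 |
Comparison with other models | ResNet18 | 88.93 | 88.66 | 91.13 | 89.69 | 11.18 |
Comparison with other models | ResNet50 | 90.09 | 90.43 | 91.09 | 90.71 | 23.53 |
Comparison with other models | ResNet101 | 91.63 | 90.55 | 93.23 | 91.70 | 42.52 |
Comparison with other models | ResNet152 | 91.76 | 90.81 | 92.67 | 91.61 | 58.16 |
Comparison with other models | ConvNeXt-T | 91.89 | 90.86 | 93.24 | 91.86 | 27.83 |
Comparison with other models | ConvNeXt-B | 93.69 | 93.36 | 93.97 | 93.62 | 87.58 |
Comparison with other models | ViT | 91.25 | 90.51 | 92.03 | 91.09 | 88.19 |
Comparison with other models | Swin-T | 92.79 | 91.39 | 93.75 | 92.40 | 30.53 |
Comparison with other models | Swin-B | 94.34 | 93.41 | 95.49 | 94.35 | 86.75 |
Comparison with other models | Proposed model | 94.85 | 94.70 | 95.46 | 95.02 | 28.25 |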
Tab. 2 Comparison experimental results
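One way to read the model-comparison rows of Tab. 2 is as a trade-off between accuracy and parameter count. The sketch below, offered only as an illustration, finds the models that are not dominated, meaning no other model is at least as accurate with no more parameters. The values are copied from Tab. 2; `MODELS` and `pareto_front` are hypothetical names, and this analysis is not part of the paper's own method.

```python
# (accuracy in %, parameters in millions) copied from the
# model-comparison rows of Tab. 2. Names are illustrative assumptions.
MODELS = {
    "VGG16": (89.35, 134.30),
    "ResNet18": (88.93, 11.18),
    "ResNet50": (90.09, 23.53),
    "ResNet101": (91.63, 42.52),
    "ResNet152": (91.76, 58.16),
    "ConvNeXt-T": (91.89, 27.83),
    "ConvNeXt-B": (93.69, 87.58),
    "ViT": (91.25, 88.19),
    "Swin-T": (92.79, 30.53),
    "Swin-B": (94.34, 86.75),
    "Proposed": (94.85, 28.25),
}


def pareto_front(models):
    """Return the non-dominated models, sorted by parameter count.

    A model is dominated if some other model has accuracy >= and
    parameters <= its own, with at least one strict inequality.
    """
    front = []
    for name, (acc, params) in models.items():
        dominated = any(
            o_acc >= acc and o_params <= params and (o_acc > acc or o_params < params)
            for other, (o_acc, o_params) in models.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front, key=lambda n: models[n][1])


if __name__ == "__main__":
    print(pareto_front(MODELS))
    gain = round(MODELS["Proposed"][0] - MODELS["ConvNeXt-T"][0], 2)
    print("accuracy gain over ConvNeXt-T (percentage points):", gain)
```

Under these numbers the front consists of ResNet18, ResNet50, ConvNeXt-T, and the proposed model; every model with more parameters than the proposed one is also less accurate.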
Model | SimAM | GC-MFF | Loss function | Accuracy/% | Mean precision/% | Mean recall/% | Mean F1/% | Parameters/10^6 |
---|---|---|---|---|---|---|---|---|
Model 1 | | | | 91.89 | 90.86 | 93.24 | 91.86 | 27.83 |
Model 2 | √ | | | 92.66 | 91.91 | 94.14 | 92.90 | 27.83 |
Model 3 | | √ | | 93.05 | 92.95 | 93.37 | 93.04 | 28.25 |
Model 4 | | | √ | 92.28 | 92.54 | 93.07 | 92.80 | 27.83 |
Model 5 | √ | √ | | 94.08 | 93.11 | 95.22 | 94.03 | 28.25 |
Model 6 | √ | | √ | 93.56 | 93.64 | 94.61 | 94.08 | 27.83 |
Model 7 | | √ | √ | 94.16 | 92.27 | 95.26 | 93.59 | 28.25 |
Model 8 | √ | √ | √ | 94.85 | 94.70 | 95.46 | 95.02 | 28.25 |
Tab. 3 Ablation experimental results
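The ablation results in Tab. 3 are easiest to compare as differences from the baseline (Model 1). The following sketch, under the assumption that Model 1 is the unmodified baseline, computes the change in accuracy, mean F1, and parameter count for each row. The values are copied from Tab. 3; `ABLATION` and `gains_over_baseline` are hypothetical names introduced for illustration.

```python
# (accuracy %, mean F1 %, parameters in millions) copied from Tab. 3.
# Names are illustrative assumptions, not taken from the paper.
ABLATION = {
    "Model 1": (91.89, 91.86, 27.83),
    "Model 2": (92.66, 92.90, 27.83),
    "Model 3": (93.05, 93.04, 28.25),
    "Model 4": (92.28, 92.80, 27.83),
    "Model 5": (94.08, 94.03, 28.25),
    "Model 6": (93.56, 94.08, 27.83),
    "Model 7": (94.16, 93.59, 28.25),
    "Model 8": (94.85, 95.02, 28.25),
}


def gains_over_baseline(results, baseline="Model 1"):
    """Return per-model differences from the baseline row, rounded to 2 decimals."""
    base = results[baseline]
    return {
        name: tuple(round(value - ref, 2) for value, ref in zip(row, base))
        for name, row in results.items()
    }


if __name__ == "__main__":
    for name, (d_acc, d_f1, d_params) in gains_over_baseline(ABLATION).items():
        print(f"{name}: accuracy {d_acc:+.2f}, mean F1 {d_f1:+.2f}, parameters {d_params:+.2f}M")
```

For Model 8 this gives +2.96 percentage points of accuracy and +3.16 points of mean F1 for an extra 0.42 million parameters.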
[1] ZAMMIT S C, SIDHU R. Capsule endoscopy - recent developments and future directions[J]. Expert Review of Gastroenterology and Hepatology, 2021, 15(2): 127-137.
[2] DRAY X, IAKOVIDIS D, HOUDEVILLE C, et al. Artificial intelligence in small bowel capsule endoscopy - current status, challenges and future promise[J]. Journal of Gastroenterology and Hepatology, 2021, 36(1): 12-19.
[3] WU H D, YANG J Y, WU Z L, et al. Application status of artificial intelligence in capsule endoscopy[J]. Clinical Research and Practice, 2024, 9(7): 195-198.
[4] XIAO Z, FENG L N. A study on wireless capsule endoscopy for small intestinal lesions detection based on deep learning target detection[J]. IEEE Access, 2020, 8: 159017-159026.
[5] SUMAN S, HUSSIN F A B, MALIK A S, et al. Detection and classification of bleeding region in WCE images using color feature[C]// Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing. New York: ACM, 2017: No.17.
[6] LIU G, YAN G, KUANG S, et al. Detection of small bowel tumor based on multi-scale curvelet analysis and fractal technology in capsule endoscopy[J]. Computers in Biology and Medicine, 2016, 70: 131-138.
[7] POGORELOV K, SUMAN S, HUSSIN F A, et al. Bleeding detection in wireless capsule endoscopy videos - color versus texture features[J]. Journal of Applied Clinical Medical Physics, 2019, 20(8): 141-154.
[8] AMIRI Z, HASSANPOUR H, BEGHDADI A. Abnormalities detection in wireless capsule endoscopy images using EM algorithm[J]. The Visual Computer, 2023, 39(7): 2999-3010.
[9] HWANG Y, LEE H H, PARK C, et al. Improved classification and localization approach to small bowel capsule endoscopy using convolutional neural network[J]. Digestive Endoscopy, 2021, 33(4): 598-607.
[10] MURUGANANTHAM P, BALAKRISHNAN S M. Attention aware deep learning model for wireless capsule endoscopy lesion classification and localization[J]. Journal of Medical and Biological Engineering, 2022, 42(2): 157-168.
[11] MARIN-SANTOS D, CONTRERAS-FERNANDEZ J A, PEREZ-BORRERO I, et al. Automatic detection of Crohn disease in wireless capsule endoscopic images using a deep convolutional neural network[J]. Applied Intelligence, 2023, 53(10): 12632-12646.
[12] YANG K, SUN Y F, WANG S W, et al. YOLOF-CBAM: a new real-time classification and detection method for colorectal polyps[J]. Electronic Measurement Technology, 2023, 46(16): 138-147.
[13] SOUAIDI M, LAFRAXO S, KERKAOU Z, et al. A multiscale polyp detection approach for GI tract images based on improved DenseNet and single-shot multi-box detector[J]. Diagnostics, 2023, 13(4): No.733.
[14] JAIN S, SEAL A, OJHA A, et al. Detection of abnormality in wireless capsule endoscopy images using fractal features[J]. Computers in Biology and Medicine, 2020, 127: No.104094.
[15] AN C, WANG C L, LIAO C, et al. Wireless capsule endoscopy image classification method based on attention relational network[J]. Computer Engineering, 2021, 47(10): 252-259, 268.
[16] XIAO P, PAN Y, CAI F, et al. A deep learning based framework for the classification of multi-class capsule gastroscope image in gastroenterologic diagnosis[J]. Frontiers in Physiology, 2022, 13: No.1060591.
[17] MOHAPATRA S, KUMAR PATI G, MISHRA M, et al. Gastrointestinal abnormality detection and classification using empirical wavelet transform and deep convolutional neural network from endoscopic images[J]. Ain Shams Engineering Journal, 2023, 14(4): No.101942.
[18] MUKHTOROV D, RAKHMONOVA M, MUKSIMOVA S, et al. Endoscopic image classification based on explainable deep learning[J]. Sensors, 2023, 23(6): No.3176.
[19] YU M. Study on organ classification of gastrointestinal capsule endoscope images[D]. Hangzhou: Zhejiang University of Technology, 2020.
[20] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010.
[21] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[EB/OL]. [2024-03-22].
[22] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778.
[23] LIU Z, MAO H, WU C Y, et al. A ConvNet for the 2020s[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 11966-11976.
[24] LIU Z, LIN Y, CAO Y, et al. Swin Transformer: hierarchical Vision Transformer using shifted windows[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 9992-10002.
[25] YANG L, ZHANG R Y, LI L, et al. SimAM: a simple, parameter-free attention module for convolutional neural networks[C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 11863-11874.
[26] CAO Y, XU J, LIN S, et al. GCNet: non-local networks meet squeeze-excitation networks and beyond[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops. Piscataway: IEEE, 2019: 1971-1980.
[27] WANG X, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7794-7803.
[28] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141.
[29] WANG Y, MA X, CHEN Z, et al. Symmetric cross entropy for robust learning with noisy labels[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 322-330.
[30] WEN Y, ZHANG K, LI Z, et al. A discriminative feature learning approach for deep face recognition[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9911. Cham: Springer, 2016: 499-515.
[31] SMEDSRUD P H, THAMBAWITA V, HICKS S A, et al. Kvasir-Capsule, a video capsule endoscopy dataset[J]. Scientific Data, 2021, 8: No.142.
[32] LIU Z, LV Q, LI Y, et al. MedAugment: universal automatic data augmentation plug-in for medical image analysis[EB/OL]. [2024-03-27].
[33] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19.
[34] WANG Q, WU B, ZHU P, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11531-11539.
[35] HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13708-13717.
[36] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. [2024-03-20].