Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (4): 1069-1076.DOI: 10.11772/j.issn.1001-9081.2024040525
• Artificial intelligence • Previous Articles Next Articles
Liwei ZHANG1,2, Quan LIANG1,2(), Yutao HU1,2, Qiaole ZHU1,2
Received:
2024-04-28
Revised:
2024-08-05
Accepted:
2024-08-08
Online:
2025-04-08
Published:
2025-04-10
Contact:
Quan LIANG
About author:
ZHANG Liwei, born in 2000, M. S. candidate. His research interests include object detection.Supported by:
张李伟1,2, 梁泉1,2(), 胡禹涛1,2, 朱乔乐1,2
通讯作者:
梁泉
作者简介:
张李伟(2000—),男,湖北阳新人,硕士研究生,主要研究方向:目标检测基金资助:
CLC Number:
Liwei ZHANG, Quan LIANG, Yutao HU, Qiaole ZHU. Channel shuffle attention mechanism based on group convolution[J]. Journal of Computer Applications, 2025, 45(4): 1069-1076.
张李伟, 梁泉, 胡禹涛, 朱乔乐. 基于分组卷积的通道重洗注意力机制[J]. 《计算机应用》唯一官方网站, 2025, 45(4): 1069-1076.
Add to citation manager EndNote|Ris|BibTeX
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024040525
模型 | 参数量/ | 计算量/GFLOPs | Top-1/% | Top-5/% |
---|---|---|---|---|
ResNet50 | 23.71 | 1.31 | 77.26 | 93.63 |
+SE[ | 26.22 | 1.32 | 79.91+2.65 | 95.04+1.41 |
+NAM[ | 23.71 | 1.31 | 78.89+1.63 | 95.09+1.46 |
+SA[ | 23.71 | 1.31 | 79.92+2.66 | 95.00+1.37 |
+CA[ | 25.62 | 1.33 | 79.91+2.65 | 95.23+1.60 |
+EMA[ | 23.90 | 1.63 | 80.21+2.95 | 95.10+1.47 |
+CSA | 25.04 | 1.33 | 80.48+3.22 | 95.19+1.56 |
Tab. 1 Comparison results of different attention mechanisms on CIFAR-100 dataset
模型 | 参数量/ | 计算量/GFLOPs | Top-1/% | Top-5/% |
---|---|---|---|---|
ResNet50 | 23.71 | 1.31 | 77.26 | 93.63 |
+SE[ | 26.22 | 1.32 | 79.91+2.65 | 95.04+1.41 |
+NAM[ | 23.71 | 1.31 | 78.89+1.63 | 95.09+1.46 |
+SA[ | 23.71 | 1.31 | 79.92+2.66 | 95.00+1.37 |
+CA[ | 25.62 | 1.33 | 79.91+2.65 | 95.23+1.60 |
+EMA[ | 23.90 | 1.63 | 80.21+2.95 | 95.10+1.47 |
+CSA | 25.04 | 1.33 | 80.48+3.22 | 95.19+1.56 |
模型 | 参数量/ | 计算量/ GFLOPs | mAP@50/% | mAP@50:95/% |
---|---|---|---|---|
YOLOv5s(v6.0) | 7.24 | 16.6 | 56.0 | 37.2 |
+CBAM | 7.56 | 16.9 | 57.1+1.1 | 37.7+0.5 |
+SA | 7.24 | 16.6 | 56.8+0.8 | 37.4+0.2 |
+CA | 7.27 | 16.7 | 57.5+1.5 | 38.1+0.9 |
+EMA | 7.24 | 16.8 | 57.8+1.8 | 38.4+1.2 |
+CSA | 7.28 | 16.7 | 58.0+2.0 | 38.0+0.8 |
Tab. 2 Comparison results of different attention mechanisms on COCO2017 dataset
模型 | 参数量/ | 计算量/ GFLOPs | mAP@50/% | mAP@50:95/% |
---|---|---|---|---|
YOLOv5s(v6.0) | 7.24 | 16.6 | 56.0 | 37.2 |
+CBAM | 7.56 | 16.9 | 57.1+1.1 | 37.7+0.5 |
+SA | 7.24 | 16.6 | 56.8+0.8 | 37.4+0.2 |
+CA | 7.27 | 16.7 | 57.5+1.5 | 38.1+0.9 |
+EMA | 7.24 | 16.8 | 57.8+1.8 | 38.4+1.2 |
+CSA | 7.28 | 16.7 | 58.0+2.0 | 38.0+0.8 |
模型 | 参数量/ | 计算量/GFLOPs | mAP@50/% | mAP@50:95/% |
---|---|---|---|---|
YOLOv8s | 11.17 | 28.8 | 95.4 | 65.8 |
+SE | 11.19 | 28.8 | 95.4 | 65.9 |
+CBAM | 11.38 | 29.0 | 95.6 | 66.2 |
+CA | 11.20 | 28.9 | 95.3 | 66.0 |
+EMA | 11.17 | 29.0 | 95.2 | 66.0 |
+CSA | 11.20 | 28.9 | 95.5 | 66.3 |
Tab. 3 Comparison results of different attention mechanisms on SHWD dataset
模型 | 参数量/ | 计算量/GFLOPs | mAP@50/% | mAP@50:95/% |
---|---|---|---|---|
YOLOv8s | 11.17 | 28.8 | 95.4 | 65.8 |
+SE | 11.19 | 28.8 | 95.4 | 65.9 |
+CBAM | 11.38 | 29.0 | 95.6 | 66.2 |
+CA | 11.20 | 28.9 | 95.3 | 66.0 |
+EMA | 11.17 | 29.0 | 95.2 | 66.0 |
+CSA | 11.20 | 28.9 | 95.5 | 66.3 |
模型 | 参数量/ | 计算量/GFLOPs | 帧率/(frame·s-1) |
---|---|---|---|
YOLOv5s+CA | 7.27 | 16.7 | 232 |
YOLOv5s+EMA | 7.24 | 16.8 | 227 |
YOLOv5s+CSA | 7.28 | 16.7 | 243 |
YOLOv8s+CA | 11.20 | 28.9 | 215 |
YOLOv8s+EMA | 11.17 | 29.0 | 178 |
YOLOv8s+CSA | 11.20 | 28.9 | 238 |
Tab. 4 Comparison of complexity and speeds among three types of attention
模型 | 参数量/ | 计算量/GFLOPs | 帧率/(frame·s-1) |
---|---|---|---|
YOLOv5s+CA | 7.27 | 16.7 | 232 |
YOLOv5s+EMA | 7.24 | 16.8 | 227 |
YOLOv5s+CSA | 7.28 | 16.7 | 243 |
YOLOv8s+CA | 11.20 | 28.9 | 215 |
YOLOv8s+EMA | 11.17 | 29.0 | 178 |
YOLOv8s+CSA | 11.20 | 28.9 | 238 |
模型 | Top-1/% | Top-5/% | 帧率/(frame·s-1) |
---|---|---|---|
+CSA_11 | 79.97 | 94.87 | 95 |
+CSA_22 | 78.71 | 95.07 | 96 |
+CSA_21 | 79.98 | 95.08 | 94 |
+CSA(无顺序恢复) | 80.15 | 95.14 | 101 |
+CSA( | 79.79 | 94.69 | 110 |
+CSA | 80.48 | 95.19 | 96 |
Tab. 5 Ablation experimental results
模型 | Top-1/% | Top-5/% | 帧率/(frame·s-1) |
---|---|---|---|
+CSA_11 | 79.97 | 94.87 | 95 |
+CSA_22 | 78.71 | 95.07 | 96 |
+CSA_21 | 79.98 | 95.08 | 94 |
+CSA(无顺序恢复) | 80.15 | 95.14 | 101 |
+CSA( | 79.79 | 94.69 | 110 |
+CSA | 80.48 | 95.19 | 96 |
1 | KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]// Proceedings of the 25th International Conference on Neural Information Processing Systems — Volume 1. Red Hook: Curran Associates Inc., 2012: 1097-1105. |
2 | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. |
3 | SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1-9. |
4 | SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition [EB/OL]. [2024-04-26].. |
5 | HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 2261-2269. |
6 | SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 2818-2826. |
7 | REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6517-6525. |
8 | HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. |
9 | WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19. |
10 | 肖斌,甘昀,汪敏,等. 基于端口注意力与通道空间注意力的网络异常流量检测[J]. 计算机应用, 2024, 44(4): 1027-1034. |
XIAO B, GAN Y, WANG M, et al. Network abnormal traffic detection based on port attention and convolutional block attention module[J]. Journal of Computer Applications, 2024, 44(4): 1027-1034. | |
11 | HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13713-13722. |
12 | YANG W, WU J, ZHANG J, et al. Deformable convolution and coordinate attention for fast cattle detection[J]. Computers and Electronics in Agriculture, 2023, 211: No.108006. |
13 | ZHAO D, CAI W, CUI L. Adaptive thresholding and coordinate attention-based tree-inspired network for aero-engine bearing health monitoring under strong noise[J]. Advanced Engineering Informatics, 2024, 61: No.102559. |
14 | ZHANG Q L, YANG Y B. SA-Net: shuffle attention for deep convolutional neural networks[C]// Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2021: 2235-2239. |
15 | OUYANG D, HE S, ZHANG G, et al. Efficient multi-scale attention module with cross-spatial learning[C]// Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2023: 1-5. |
16 | WU T, DONG Y. YOLO-SE: improved YOLOv8 for remote sensing object detection and recognition[J]. Applied Sciences, 2023, 13(24): No.12977. |
17 | CHEN S, LI Y, ZHANG Y, et al. Soft X-ray image recognition and classification of maize seed cracks based on image enhancement and optimized YOLOv8 model[J]. Computers and Electronics in Agriculture, 2024, 216: No.108475. |
18 | LIU Y, SHAO Z, TENG Y, et al. NAM: normalization-based attention module[EB/OL]. [2024-04-26].. |
19 | WANG C Y, LIAO H Y M, WU Y H, et al. CSPNet: a new backbone that can enhance learning capability of CNN[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2020: 1571-1580. |
20 | HOWARD A G, ZHU M, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[EB/OL]. [2024-04-26].. |
21 | SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520. |
22 | HOWARD A, SANDLER M, CHEN B, et al. Searching for MobileNetV3[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 1314-1324. |
23 | ZHANG X, ZHOU X, LIN M, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6848-6856. |
24 | MA N, ZHANG X, ZHENG H T, et al. ShuffleNet V2: practical guidelines for efficient CNN architecture design[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11218. Cham: Springer, 2018: 122-138. |
25 | LI X, HU X, YANG J. Spatial group-wise enhance: improving semantic feature learning in convolutional networks[EB/OL]. [2024-04-26].. |
26 | YANG K, CHANG S, TIAN Z, et al. Automatic polyp detection and segmentation using shuffle efficient channel attention network[J]. Alexandria Engineering Journal, 2022, 61(1): 917-926. |
27 | WANG Q, WU B, ZHU P, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11531-11539. |
28 | GAO X, XU L, WANG F, et al. Multi-branch aware module with channel shuffle pixel-wise attention for lightweight image super-resolution[J]. Multimedia Systems, 2023, 29: 289-303. |
29 | LIU K, CHEN K, GUO L, et al. ShuffleMix: improving representations via channel-wise shuffle of interpolated hidden states[EB/OL]. [2024-04-26].. |
30 | LYU J, ZHANG S, QI Y, et al. AutoShuffleNet: learning permutation matrices via an exact Lipschitz continuous penalty in deep convolutional neural networks[C]// Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2020: 608-616. |
31 | WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 7464-7475. |
32 | WANG C Y, LIAO H Y M, YEH I H. Designing network design strategies through gradient path analysis[J]. Journal of Information Science and Engineering, 2023, 39(4): 975-995. |
33 | CHEN Y, KALANTIDIS Y, LI J, et al. A2-Nets: double attention networks[C]// Proceedings of the 32nd International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2018: 350-359. |
34 | YANG L, ZHANG R Y, LI L, et al. SimAM: a simple, parameter-free attention module for convolutional neural networks[C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 11863-11874. |
35 | LI X, WANG W, HU X, et al. Selective kernel networks[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 510-519. |
36 | DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database[C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 248-255. |
37 | LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8693. Cham: Springer, 2014: 740-755. |
38 | Ultralytics. YOLOv8[EB/OL]. [2024-04-26].. |
39 | Ultralytics. YOLOv5[EB/OL]. [2024-04-26].. |
40 | SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 618-626. |
[1] | Yiqin YAN, Chuan LUO, Tianrui LI, Hongmei CHEN. Cross-domain few-shot classification model based on relation network and Vision Transformer [J]. Journal of Computer Applications, 2025, 45(4): 1095-1103. |
[2] | Jie HU, Qiyang ZHENG, Jun SUN, Yan ZHANG. Multi-label classification model based on multi-label relational graph and local dynamic reconstruction learning [J]. Journal of Computer Applications, 2025, 45(4): 1104-1112. |
[3] | Qingqing ZHAO, Bin HU. Moving pedestrian detection neural network with invariant global sparse contour point representation [J]. Journal of Computer Applications, 2025, 45(4): 1271-1284. |
[4] | Shiyue GUO, Jianwu DANG, Yangping WANG, Jiu YONG. 3D hand pose estimation combining attention mechanism and multi-scale feature fusion [J]. Journal of Computer Applications, 2025, 45(4): 1293-1299. |
[5] | Chun XU, Shuangyan JI, Huan MA, Enwei SUN, Mengmeng WANG, Mingyu SU. Consultation recommendation method based on knowledge graph and dialogue structure [J]. Journal of Computer Applications, 2025, 45(4): 1157-1168. |
[6] | Yang HOU, Qiong ZHANG, Zixuan ZHAO, Zhengyu ZHU, Xiaobo ZHANG. YOLOv5s-MRD: efficient fire and smoke detection algorithm for complex scenarios based on YOLOv5s [J]. Journal of Computer Applications, 2025, 45(4): 1317-1324. |
[7] | Kunyuan JIANG, Xiaoxia LI, Li WANG, Yaodan CAO, Xiaoqiang ZHANG, Nan DING, Yingyue ZHOU. Boundary-cross supervised semantic segmentation network with decoupled residual self-attention [J]. Journal of Computer Applications, 2025, 45(4): 1120-1129. |
[8] | Meirong DING, Jinxin ZHUO, Yuwu LU, Qinglong LIU, Jicong LANG. Domain adaptation integrating environment label smoothing and nuclear norm discrepancy [J]. Journal of Computer Applications, 2025, 45(4): 1130-1138. |
[9] | Liqin WANG, Zhilei GENG, Yingshuang LI, Yongfeng DONG, Meng BIAN. Open-world knowledge reasoning model based on path and enhanced triplet text [J]. Journal of Computer Applications, 2025, 45(4): 1177-1183. |
[10] | Chuanhao ZHANG, Xiaohan TU, Xuehui GU, Bo XUAN. LiDAR-camera 3D object detection based on multi-modal information mutual guidance and supplementation [J]. Journal of Computer Applications, 2025, 45(3): 946-952. |
[11] | Haijun GENG, Yun DONG, Zhiguo HU, Haotian CHI, Jing YANG, Xia YIN. Encrypted traffic classification method based on Attention-1DCNN-CE [J]. Journal of Computer Applications, 2025, 45(3): 872-882. |
[12] | Songsen YU, Zhifan LIN, Guopeng XUE, Jianyu XU. Lightweight large-format tile defect detection algorithm based on improved YOLOv8 [J]. Journal of Computer Applications, 2025, 45(2): 647-654. |
[13] | Danni DING, Bo PENG, Xi WU. VPNet: fatty liver ultrasound image classification method inspired by ventral pathway [J]. Journal of Computer Applications, 2025, 45(2): 662-669. |
[14] | Tianqi ZHANG, Shuang TAN, Xiwen SHEN, Juan TANG. Image watermarking method combining attention mechanism and multi-scale feature [J]. Journal of Computer Applications, 2025, 45(2): 616-623. |
[15] | Yan LI, Guanhua YE, Yawen LI, Meiyu LIANG. Enterprise ESG indicator prediction model based on richness coordination technology [J]. Journal of Computer Applications, 2025, 45(2): 670-676. |
Viewed | ||||||
Full text |
|
|||||
Abstract |
|
|||||