Journal of Computer Applications, 2024, Vol. 44, Issue (6): 1927-1934. DOI: 10.11772/j.issn.1001-9081.2023060775
Special Issue: Multimedia computing and computer simulation
Xiaohui CHENG1,2, Yuntian HUANG1, Ruifang ZHANG3
Received: 2023-06-20
Revised: 2023-09-11
Accepted: 2023-09-12
Online: 2023-09-27
Published: 2024-06-10
Contact: Ruifang ZHANG
About author: CHENG Xiaohui, born in 1961 in Zhangshu, Jiangxi, professor. His research interests include embedded systems, IoT, and artificial intelligence.
Xiaohui CHENG, Yuntian HUANG, Ruifang ZHANG. Lightweight infrared road scene detection model based on multiscale and weighted coordinate attention[J]. Journal of Computer Applications, 2024, 44(6): 1927-1934.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023060775
Name | Configuration |
---|---|
Operating system | Ubuntu 16.04 |
Processor | Intel Xeon Silver 4110 CPU @ 2.10 GHz |
GPU | NVIDIA RTX 2080Ti (×4) |
Deep learning framework | PyTorch 1.8.1, CUDA 10.1, cuDNN 7.6.4 |
Tab.1 Experiment environment configuration
Case | Params/10⁶ | FLOPs/10⁶ | mAP(IoU=0.5)/% | F1/% |
---|---|---|---|---|
1 | 2.15 | 5.1 | 76.5 | 73.6 |
2 | 2.21 | 5.6 | 77.0 | 74.6 |
3 | 2.19 | 5.5 | 76.6 | 73.7 |
4 | 2.29 | 5.9 | 77.7 | 75.2 |
Tab.2 Comparison of experiment results in four scenarios
Input | op | exp | out | k | se | s |
---|---|---|---|---|---|---|
640²×1 | MB | — | 16 | 3 | — | 2 |
320²×16 | MB | 16 | 16 | 3 | — | 1 |
320²×16 | MB | 64 | 24 | 3 | — | 2 |
160²×24 | MB | 72 | 24 | 3 | — | 1 |
160²×24 | MB | 72 | 40 | 5 | √ | 2 |
80²×40 | MBx | 128 | 40 | PSA | — | — |
80²×40 | MB | 120 | 40 | 5 | √ | 1 |
80²×40 | MB | 240 | 80 | 3 | — | 2 |
40²×80 | MB | 200 | 80 | 3 | — | 1 |
40²×80 | MB | 184 | 80 | 3 | — | 1 |
40²×80 | MB | 184 | 80 | 3 | — | 1 |
40²×80 | MBx | 256 | 112 | PSA | — | — |
40²×112 | MB | 672 | 112 | 3 | √ | 1 |
40²×112 | MB | 672 | 160 | 3 | √ | 2 |
20²×160 | MBx | 192 | 256 | PSA | — | — |
Tab.3 Structure of MSM-Net
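The layer table can be sanity-checked by walking it row by row. The sketch below is only an illustration, not the authors' code: it reads the Input column as spatial size squared times channels (for example 640×640×1), and it assumes that MBx (PSA) rows, which list no stride, keep the spatial size unchanged. The names `ROWS` and `trace_shapes` are hypothetical.

```python
# Rows of Tab.3 as (input_size, in_channels, op, out_channels, stride).
# Assumption: MBx (PSA) rows list no stride, so they are treated as stride 1.
ROWS = [
    (640, 1, "MB", 16, 2),
    (320, 16, "MB", 16, 1),
    (320, 16, "MB", 24, 2),
    (160, 24, "MB", 24, 1),
    (160, 24, "MB", 40, 2),
    (80, 40, "MBx", 40, 1),
    (80, 40, "MB", 40, 1),
    (80, 40, "MB", 80, 2),
    (40, 80, "MB", 80, 1),
    (40, 80, "MB", 80, 1),
    (40, 80, "MB", 80, 1),
    (40, 80, "MBx", 112, 1),
    (40, 112, "MB", 112, 1),
    (40, 112, "MB", 160, 2),
    (20, 160, "MBx", 256, 1),
]


def trace_shapes(rows):
    """Return the (size, channels) produced by each row, checking that it
    matches the input listed for the following row."""
    outputs = []
    for i, (size, _cin, _op, cout, stride) in enumerate(rows):
        out = (size // stride, cout)
        if i + 1 < len(rows):
            next_input = (rows[i + 1][0], rows[i + 1][1])
            if next_input != out:
                raise ValueError(
                    f"row {i} produces {out}, but row {i + 1} expects {next_input}"
                )
        outputs.append(out)
    return outputs


shapes = trace_shapes(ROWS)
print(shapes[-1])  # size and channel count after the last listed block
```

Under these assumptions every row's output matches the next row's input, which suggests the table describes a purely sequential backbone with five stride-2 stages.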
alpha | mAP(IoU=0.5)/% | P/% | R/% | F1/% |
---|---|---|---|---|
0.3 | 78.2 | 79.6 | 70.5 | 74.7 |
0.4 | 77.0 | 79.1 | 70.7 | 74.6 |
0.5 | 78.1 | 77.3 | 73.1 | 75.1 |
0.6 | 77.4 | 77.9 | 71.2 | 74.3 |
0.7 | 76.8 | 81.0 | 69.6 | 74.8 |
Tab.4 Influence of alpha on WCA
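The F1 column can be cross-checked against the P and R columns, assuming F1 is the usual harmonic mean of precision and recall (the page does not state the formula). The sketch below is illustrative; `f1_score` and `TAB4` are hypothetical names, and the values are copied from Tab.4.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (alpha, P/%, R/%, reported F1/%) copied from Tab.4
TAB4 = [
    (0.3, 79.6, 70.5, 74.7),
    (0.4, 79.1, 70.7, 74.6),
    (0.5, 77.3, 73.1, 75.1),
    (0.6, 77.9, 71.2, 74.3),
    (0.7, 81.0, 69.6, 74.8),
]

for alpha, p, r, reported in TAB4:
    recomputed = f1_score(p, r)
    print(f"alpha={alpha}: recomputed F1={recomputed:.2f}, reported F1={reported}")
```

The recomputed values should agree with the reported ones only up to small differences, since P and R are themselves published to one decimal place.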
Model | M3 | MSM | WCA | EIoU | Params/10⁶ | FLOPs/10⁶ | P/% | R/% | mAP(IoU=0.5)/% | F1/% |
---|---|---|---|---|---|---|---|---|---|---|
YOLOv7-tiny | × | × | × | × | 6.01 | 13.0 | 81.7 | 71.8 | 78.9 | 76.4 |
A | √ | × | × | × | 3.68 | 6.3 | 80.8 | 67.2 | 76.3 | 73.3 |
B | × | √ | × | × | 2.25 | 5.9 | 80.3 | 70.8 | 77.7 | 75.2 |
C | × | √ | √ | × | 2.29 | 5.9 | 77.3 | 73.1 | 78.1 | 75.1 |
D | × | √ | √ | √ | 2.29 | 5.9 | 80.2 | 71.0 | 78.2 | 75.3 |
Tab.5 Comparison of ablation experiment results
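To put the ablation numbers in relative terms, the following sketch computes how much smaller model D is than the YOLOv7-tiny baseline, using only values copied from Tab.5. It is a worked calculation rather than anything from the paper; the helper `relative_reduction` and the dictionary keys are hypothetical.

```python
def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline


# Values copied from Tab.5 (units as given in the table header).
baseline = {"params": 6.01, "flops": 13.0, "mAP": 78.9}  # YOLOv7-tiny
model_d = {"params": 2.29, "flops": 5.9, "mAP": 78.2}    # model D

param_cut = relative_reduction(baseline["params"], model_d["params"])
flops_cut = relative_reduction(baseline["flops"], model_d["flops"])
map_drop = baseline["mAP"] - model_d["mAP"]  # in percentage points

print(f"parameters reduced by {param_cut:.1f}%")
print(f"floating-point operations reduced by {flops_cut:.1f}%")
print(f"mAP lower by {map_drop:.1f} percentage points")
```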
Model | Params/10⁶ | FLOPs/10⁶ | Size/MB | FPS | AP (Car)/% | AP (Bicycle)/% | AP (Person)/% | mAP(IoU=0.5)/% | F1/% |
---|---|---|---|---|---|---|---|---|---|
YOLOv3-tiny | 8.67 | 12.9 | 16.61 | 507 | 86.3 | 52.5 | 75.1 | 71.3 | 70.8 |
YOLOv5s | 7.01 | 15.8 | 13.76 | 155 | 90.3 | 62.6 | 83.0 | 78.6 | 76.2 |
ShuffleNet-YOLOv7-tiny | 4.49 | 8.5 | 8.91 | 107 | 87.8 | 49.0 | 77.5 | 71.5 | 71.2 |
EfficientNet-YOLOv7-tiny | 3.87 | 7.9 | 7.74 | 127 | 89.6 | 57.3 | 82.9 | 76.6 | 74.0 |
-YOLOv7-tiny | 3.68 | 6.3 | 7.36 | 103 | 89.2 | 57.4 | 82.4 | 76.3 | 73.3 |
YOLOv7-tiny | 6.01 | 13.0 | 11.72 | 156 | 91.3 | 61.5 | 83.8 | 78.9 | 76.4 |
YOLOv8n | 3.01 | 8.1 | 5.96 | 169 | 89.3 | 56.8 | 81.3 | 75.8 | 72.8 |
FS-YOLOv5s[ | 5.20 | — | 10.70 | — | 89.1 | 59.2 | 81.5 | 76.6 | — |
Strip-YOLOs [ | 8.10 | 19.3 | — | — | 90.5 | 67.1 | 84.8 | 80.7 | — |
MSC-YOLO | 2.29 | 5.9 | 4.63 | 101 | 89.2 | 62.3 | 83.1 | 78.2 | 75.3 |
Tab.6 Comparison of experimental results of different models
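The mAP column appears to be the unweighted mean of the three per-class AP values (Car, Bicycle, Person). That reading is an assumption, and the sketch below checks it on a few rows copied from Tab.6; `TAB6_ROWS` and `mean_ap` are hypothetical names.

```python
from statistics import mean

# (model, AP Car/%, AP Bicycle/%, AP Person/%, reported mAP/%) from Tab.6
TAB6_ROWS = [
    ("YOLOv3-tiny", 86.3, 52.5, 75.1, 71.3),
    ("YOLOv5s", 90.3, 62.6, 83.0, 78.6),
    ("YOLOv7-tiny", 91.3, 61.5, 83.8, 78.9),
    ("YOLOv8n", 89.3, 56.8, 81.3, 75.8),
    ("MSC-YOLO", 89.2, 62.3, 83.1, 78.2),
]


def mean_ap(per_class_ap):
    """Unweighted mean of per-class average precision values."""
    return mean(per_class_ap)


for name, car, bicycle, person, reported in TAB6_ROWS:
    recomputed = mean_ap([car, bicycle, person])
    print(f"{name}: recomputed mAP={recomputed:.2f}, reported mAP={reported}")
```

Small differences between recomputed and reported values are expected because the per-class APs are published to one decimal place.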
1 | LI Q L, ZHOU X W, WEI M E, et al. Infrared target detection algorithm based on strip pooling and attention mechanism in street scene [J]. Computer Engineering, 2023, 49(8): 310-320. |
2 | DAI X, YUAN X, WEI X. TIRNet: object detection in thermal infrared images for autonomous driving [J]. Applied Intelligence, 2020, 51(3): 1244-1261. |
3 | ZHANG H, LUO C, WANG Q, et al. A novel infrared video surveillance system using deep learning based techniques [J]. Multimedia Tools and Applications, 2018, 77: 26657-26676. |
4 | MURESAN M P, BREHAR R D, NEDEVSCHI S. Vision algorithms and embedded solution for pedestrian detection with far infrared camera [C]// Proceedings of the 2014 IEEE 10th International Conference on Intelligent Computer Communication and Processing. Piscataway: IEEE, 2014: 133-136. |
5 | VIOLA P, JONES M. Rapid object detection using a boosted cascade of simple features [C]// Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2001: 511-518. |
6 | AJAY A, DIXON K D M, SOWMYA V, et al. Aerial image classification using GURLS and LIBSVM [C]// Proceedings of the 2016 International Conference on Communication and Signal Processing. Piscataway: IEEE, 2016: 396-401. |
7 | HUANG D, WANG Y-H, WANG Y-D. A robust infrared face recognition method based on AdaBoost Gabor features [C]// Proceedings of the 2007 International Conference on Wavelet Analysis and Pattern Recognition. Piscataway: IEEE, 2007: 1114-1118. |
8 | REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [C]// Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015: 91-99. |
9 | REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL].[2023-06-03]. . |
10 | LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector [C]// Proceedings of the 14th European Conference on Computer Vision. Cham: Springer, 2016: 21-37. |
11 | LI S, LI Y, LI Y, et al. YOLO-FIRI: improved YOLOv5 for infrared image object detection [J]. IEEE Access, 2021, 9: 141861-141875. |
12 | ZHAO M, ZHANG H R. An infrared object detection method based on cross-domain fusion network [J]. Acta Photonica Sinica, 2021, 50(11): 111001. |
13 | HUANG L, YANG Y, YANG C Y, et al. FS-YOLOv5: lightweight infrared road target detection method [J]. Computer Engineering and Applications, 2023, 59(9): 215-224. |
14 | QIN P, TANG C M, LIU Y F, et al. Infrared target detection method based on improved YOLOv3 [J]. Computer Engineering, 2022, 48(3): 211-219. |
15 | SHEN H Y, YU H H, WANG H C, et al. Object detection algorithm of thermal infrared images based on improved YOLOX [J]. Electronic Measurement Technology, 2022, 45(23): 72-81. |
16 | HOWARD A, SANDLER M, CHU G, et al. Searching for MobileNetV3 [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 1314-1324. |
17 | ZHAO X, ZHANG L, PANG Y, et al. A single stream network for robust and real-time RGB-D salient object detection [C]// Proceedings of the 16th European Conference on Computer Vision. Cham: Springer, 2020: 646-662. |
18 | FENG M, LU H, DING E. Attentive feedback network for boundary-aware salient object detection [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1623-1632. |
19 | ZHANG H, ZU K, LU J, et al. EPSANet: an efficient pyramid squeeze attention block on convolutional neural network [C]// Proceedings of the 16th Asian Conference on Computer Vision. Cham: Springer, 2021: 541-557. |
20 | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. |
21 | HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. |
22 | HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13708-13717. |
23 | TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 10778-10787. |
24 | ZHANG Y F, REN W, ZHANG Z, et al. Focal and efficient IOU loss for accurate bounding box regression [J]. Neurocomputing, 2022, 506: 146-157. |