Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (7): 2264-2270. DOI: 10.11772/j.issn.1001-9081.2023070956
Received: 2023-07-17
Revised: 2023-09-10
Accepted: 2023-09-20
Online: 2023-10-26
Published: 2024-07-10
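Corresponding author: Bo PENG
About the author: LONG Wudan (born 1998), female, from Chongqing, M.S. candidate. Her research interests include deep learning and object detection.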
Wudan LONG¹, Bo PENG¹, Jie HU¹, Ying SHEN¹˒², Danni DING³
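Abstract:
To address the difficulty of detecting road damage caused by small damage regions and an imbalanced number of samples across categories, a road damage detection algorithm named RDD-YOLO was proposed based on YOLOv7-tiny. Firstly, the K-means++ algorithm was used to obtain anchor boxes that fit the target sizes better. Secondly, a Quantization-Aware RepVGG (QARepVGG) module was used in the small-object detection branch to enhance shallow feature extraction, and an enhanced attention module (AM-CBAM) was constructed and embedded at the three inputs of the neck to suppress interference from complex backgrounds. Thirdly, a feature fusion module (Res-RFB) was designed to enlarge the receptive field in imitation of the human eye and fuse multi-scale information, improving representation ability; in addition, a lightweight decoupled head (S-DeHead) was constructed to improve the precision of small-object detection. Finally, the Normalized Wasserstein Distance (NWD) metric was used to optimize the localization of small objects and to alleviate sample imbalance. Experimental results show that, compared with YOLOv7-tiny, RDD-YOLO improves mAP50 by 6.19 percentage points and F1-Score by 5.31 percentage points at a cost of only 0.71×10⁶ additional parameters and 1.7 GFLOPs of additional computation, and reaches a detection speed of 135.26 frame/s, meeting the accuracy and speed requirements of road maintenance work.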
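The NWD metric named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used formulation in which each box `(cx, cy, w, h)` is modeled as a 2D Gaussian with mean `(cx, cy)` and covariance `diag(w²/4, h²/4)`, so that the squared 2-Wasserstein distance reduces to a squared Euclidean distance between the vectors `(cx, cy, w/2, h/2)`. The function names, the default value of the normalizing constant `c`, and the `1 - NWD` loss form are illustrative assumptions; the page does not state how RDD-YOLO weights or combines this term.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    `c` is a dataset-dependent normalizing constant; the default used
    here is only a placeholder assumption.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians.
    w2_sq = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
             + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)


def nwd_loss(pred, target, c=12.8):
    """Assumed loss form: 1 - NWD, which is 0 for identical boxes."""
    return 1.0 - nwd(pred, target, c)


def iou(box_a, box_b):
    """Plain IoU for (cx, cy, w, h) boxes, included for comparison."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


# Two 4x4 boxes whose centers are 4 px apart only touch at an edge:
# IoU is zero, while NWD still gives a smooth, non-zero similarity.
a, b = (10.0, 10.0, 4.0, 4.0), (14.0, 10.0, 4.0, 4.0)
print(iou(a, b), nwd(a, b))
```

For tiny boxes a shift of a few pixels can drive IoU to zero, whereas this distance keeps decreasing smoothly with the shift, which is one plausible reading of why the metric helps small-object localization.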
Wudan LONG, Bo PENG, Jie HU, Ying SHEN, Danni DING. Road damage detection algorithm based on enhanced feature extraction[J]. Journal of Computer Applications, 2024, 44(7): 2264-2270.
| Model | mAP50/% | F1-Score/% | Params/10⁶ | Computation/GFLOPs |
|---|---|---|---|---|
| YOLOv7-tiny | 57.32 | 57.72 | 6.23 | 13.9 |
| +CBAM | 58.41 | 59.31 | 6.25 | 13.9 |
| +AM-CBAM | 58.59 | 59.53 | 6.25 | 13.9 |

Tab. 1 Performance comparison of AM-CBAM and CBAM
| Model | mAP50/% | F1-Score/% | Params/10⁶ | Computation/GFLOPs |
|---|---|---|---|---|
| YOLOv7-tiny | 57.32 | 57.72 | 6.23 | 13.9 |
| +RFB | 57.68 | 57.94 | 6.66 | 14.2 |
| +RFB3×3 | 57.84 | 58.78 | 6.62 | 14.2 |
| +Res-RFB | 58.20 | 59.02 | 6.75 | 14.3 |

Tab. 2 Res-RFB module ablation experiment results
| Model | mAP50/% | F1-Score/% | Params/10⁶ | Computation/GFLOPs |
|---|---|---|---|---|
| YOLOv7-tiny | 57.32 | 57.72 | 6.23 | 13.9 |
| +EffiDeHead | 58.25 | 59.08 | 9.96 | 34.8 |
| +S-DeHead | 58.37 | 59.09 | 6.44 | 15.1 |

Tab. 3 Performance comparison between S-DeHead and EffiDeHead
| Model | mAP50/% | F1-Score/% | Params/10⁶ | Computation/GFLOPs |
|---|---|---|---|---|
| YOLOv7-tiny | 57.32 | 57.72 | 6.23 | 13.9 |
| +K-means++ | 57.75 | 58.51 | 6.23 | 13.9 |
| +QARepVGG | 58.92 | 59.63 | 6.97 | 7.5 |
| +AM-CBAM | 58.59 | 59.53 | 6.25 | 13.9 |
| +Res-RFB | 58.20 | 59.02 | 6.75 | 14.3 |
| +S-DeHead | 58.37 | 59.09 | 6.44 | 15.1 |
| +NWDLoss | 58.17 | 58.91 | 6.23 | 13.9 |
| RDD-YOLO | 63.51 | 63.03 | 6.94 | 15.6 |

Tab. 4 Module ablation experiment results of proposed algorithm on RDD2022 dataset
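The gains quoted in the abstract can be cross-checked against the Tab. 4 rows with a few lines of arithmetic. The sketch below copies the table values verbatim; the variable and function names are illustrative only, and the comparison of summed single-module gains with the full model is an added observation, not a claim made on this page.

```python
# Rows copied from Tab. 4:
# (model, mAP50/%, F1-Score/%, Params/10^6, Computation/GFLOPs)
BASELINE = ("YOLOv7-tiny", 57.32, 57.72, 6.23, 13.9)
SINGLE_MODULE = [
    ("+K-means++", 57.75, 58.51, 6.23, 13.9),
    ("+QARepVGG", 58.92, 59.63, 6.97, 7.5),
    ("+AM-CBAM", 58.59, 59.53, 6.25, 13.9),
    ("+Res-RFB", 58.20, 59.02, 6.75, 14.3),
    ("+S-DeHead", 58.37, 59.09, 6.44, 15.1),
    ("+NWDLoss", 58.17, 58.91, 6.23, 13.9),
]
FULL = ("RDD-YOLO", 63.51, 63.03, 6.94, 15.6)


def delta(row, base=BASELINE):
    """Difference of each numeric column relative to the baseline row."""
    return tuple(round(r - b, 2) for r, b in zip(row[1:], base[1:]))


# Gains of the full model: (mAP50, F1-Score, params, GFLOPs).
full_gain = delta(FULL)

# mAP50 gain of each module when added to the baseline on its own.
single_gains = {row[0]: delta(row)[0] for row in SINGLE_MODULE}
sum_single = round(sum(single_gains.values()), 2)

print("full model gain:", full_gain)
print("single-module mAP50 gains:", single_gains)
print("sum of single-module mAP50 gains:", sum_single)
```

With these table values, the full-model differences match the abstract (6.19 and 5.31 percentage points, 0.71×10⁶ parameters, 1.7 GFLOPs), and the single-module mAP50 gains sum to about 6.08 points, close to the combined gain.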
| Model | mAP50/% | mAP75/% | mAP50:95/% | F1-Score/% | Params/10⁶ | Computation/GFLOPs | Frame rate/(frame/s) | Model size/MB |
|---|---|---|---|---|---|---|---|---|
| YOLOv5s | 58.75 | 35.70 | 27.05 | 60.00 | 7.02 | 15.80 | 121.80 | 13.80 |
| YOLOv6s | 56.10 | 35.37 | 26.12 | 56.87 | 18.52 | 45.30 | 109.89 | 36.50 |
| YOLOv7-tiny | 57.32 | 37.74 | 26.82 | 57.72 | 6.23 | 13.90 | 166.67 | 11.74 |
| YOLOv7 | 61.11 | 38.08 | 29.12 | 61.35 | 37.62 | 106.50 | 116.28 | 71.38 |
| YOLOv8s | 57.04 | 35.43 | 31.47 | 56.08 | 11.13 | 28.40 | 123.46 | 21.48 |
| Faster R-CNN | 57.82 | 36.44 | 27.46 | 58.47 | 41.53 | 91.41 | 96.24 | 86.50 |
| SSD | 53.67 | 33.39 | 24.63 | 55.27 | 34.31 | 386.25 | 103.52 | 68.20 |
| RDD-YOLO | 63.51 | 43.87 | 31.33 | 63.03 | 6.94 | 15.60 | 135.26 | 13.22 |

Tab. 5 Comprehensive performance comparison between proposed algorithm and other seven algorithms
| Dataset | Model | mAP50 | mAP50:95 | F1-Score |
|---|---|---|---|---|
| RDD2020-Japan | YOLOv7-tiny | 58.50 | 27.42 | 59.01 |
| RDD2020-Japan | RDD-YOLO | 63.02 | 31.09 | 62.89 |
| RDD2020-India | YOLOv7-tiny | 57.49 | 27.18 | 57.76 |
| RDD2020-India | RDD-YOLO | 62.06 | 29.07 | 62.11 |

Tab. 6 Generalization experiment results on other datasets (unit: %)