Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (8): 2555-2565. DOI: 10.11772/j.issn.1001-9081.2024071020
• Artificial Intelligence •
Yanhua LIAO, Yuanxia YAN, Wenlin PAN
Received: 2024-07-19
Revised: 2024-11-04
Accepted: 2024-11-04
Online: 2024-11-19
Published: 2025-08-10
Corresponding author: Wenlin PAN
About the author: LIAO Yanhua, born in 2000 in Yichun, Jiangxi, male, M. S. candidate. His research interests include image processing.
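Abstract:
Traffic intersection images are complex: small targets are hard to detect, targets often occlude one another, and changes in weather and illumination cause color distortion, noise, and blur. To address these problems, a multi-target detection algorithm for traffic intersection images based on YOLOv9 (You Only Look Once version 9), named ITD-YOLOv9 (Intersection Target Detection-YOLOv9), is proposed. First, the image enhancement network CoT-CAFRNet (Chain-of-Thought prompted Content-Aware Feature Reassembly Network) is designed to improve image quality and optimize the input features. Second, a channel adaptive feature fusion (iCAFF) module is added to strengthen feature extraction for small targets and for overlapping, occluded targets. Third, the feature fusion pyramid structure BiHS-FPN (Bi-directional High-level Screening Feature Pyramid Network) is proposed to strengthen the fusion of multi-scale features. Finally, the IF-MPDIoU (Inner-Focaler-Minimum Point Distance based Intersection over Union) loss function is designed, which adjusts variable factors to focus on key samples and improve generalization. Experimental results show that, on a self-built dataset and the SODA10M dataset, ITD-YOLOv9 reaches a detection accuracy (mAP@0.5) of 83.8% and 56.3%, respectively, at detection frame rates of 64.8 frame/s and 57.4 frame/s. Compared with YOLOv9, the detection accuracy of ITD-YOLOv9 improves by 3.9 and 2.7 percentage points, respectively. The proposed algorithm therefore achieves effective multi-target detection at traffic intersections.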
Yanhua LIAO, Yuanxia YAN, Wenlin PAN. Multi-target detection algorithm for traffic intersection images based on YOLOv9[J]. Journal of Computer Applications, 2025, 45(8): 2555-2565.
Parameter | Setting | Parameter | Setting |
---|---|---|---|
Learning Rate | 0.01 | Batch Size | 8 |
Image Size | 640 | Epochs | 100 |
Momentum | 0.937 | Weight Decay | 0.0005 |
Optimizer | SGD | | |
Tab. 1 Training hyperparameters
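To make the roles of the settings in Tab. 1 concrete, the following minimal sketch applies SGD updates with momentum and weight decay to a single scalar parameter. It assumes the common formulation in which the decay term is added to the gradient before the momentum buffer is updated; whether the authors' training code uses exactly this variant is not stated in the source, and `sgd_step` is a hypothetical helper name.

```python
# Hyperparameters taken from Tab. 1.
LEARNING_RATE = 0.01
MOMENTUM = 0.937
WEIGHT_DECAY = 0.0005


def sgd_step(param, grad, velocity,
             lr=LEARNING_RATE, momentum=MOMENTUM, weight_decay=WEIGHT_DECAY):
    """One SGD update for a scalar parameter (hypothetical helper).

    Assumed formulation: weight decay is added to the gradient, the
    momentum buffer accumulates the decayed gradient, and the parameter
    moves against the buffer scaled by the learning rate.
    """
    grad = grad + weight_decay * param
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity


# Two consecutive steps on a toy parameter with a constant gradient.
p, v = 1.0, 0.0
p, v = sgd_step(p, grad=0.5, velocity=v)
p, v = sgd_step(p, grad=0.5, velocity=v)
print(p, v)
```

With a momentum of 0.937 the buffer grows quickly under a constant gradient, which is why the second step moves the parameter almost twice as far as the first.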
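Algorithm | mAP@0.5/% | Precision/% | Recall/% | Frame rate/(frame·s⁻¹) | AP@0.5/% (class 0) | AP@0.5/% (class 1) | AP@0.5/% (class 2) | AP@0.5/% (class 3) | AP@0.5/% (class 4) | AP@0.5/% (class 5) | AP@0.5/% (class 6) | AP@0.5/% (class 7) | AP@0.5/% (class 8) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|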
YOLOv9 | 79.9 | 85.1 | 74.8 | 69.6 | 80.4 | 87.2 | 76.2 | 89.3 | 68.4 | 82.9 | 66.8 | 80.8 | 86.9 |
ITD-YOLOv9 | 83.8 | 88.2 | 76.6 | 64.8 | 83.8 | 95.5 | 79.2 | 94.2 | 72.8 | 88.8 | 69.5 | 82.7 | 87.4 |
Tab. 2 Comparison of target detection accuracy between ITD-YOLOv9 and YOLOv9 algorithms
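The mAP@0.5 column of Tab. 2 is the mean of the nine per-class AP@0.5 values. The short check below recomputes it from the per-class columns and lists the per-class gain of ITD-YOLOv9 over YOLOv9. The numbers are copied from the table; the helper names `mean_ap` and `per_class_gain` are illustrative only.

```python
from statistics import mean

# Per-class AP@0.5 (%) for classes 0-8, copied from Tab. 2.
AP_PER_CLASS = {
    "YOLOv9": [80.4, 87.2, 76.2, 89.3, 68.4, 82.9, 66.8, 80.8, 86.9],
    "ITD-YOLOv9": [83.8, 95.5, 79.2, 94.2, 72.8, 88.8, 69.5, 82.7, 87.4],
}


def mean_ap(ap_values):
    """Mean of the per-class AP values, rounded to one decimal place."""
    return round(mean(ap_values), 1)


def per_class_gain(baseline, improved):
    """Gain in percentage points for each class (improved minus baseline)."""
    return [round(b - a, 1) for a, b in zip(baseline, improved)]


for name, values in AP_PER_CLASS.items():
    print(name, mean_ap(values))

gains = per_class_gain(AP_PER_CLASS["YOLOv9"], AP_PER_CLASS["ITD-YOLOv9"])
print(gains)
```

The recomputed means agree with the reported mAP@0.5 values of 79.9% and 83.8%, and the largest per-class gain falls on class 1.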
Image enhancement network | mAP@0.5/% | Frame rate/(frame·s⁻¹) |
---|---|---|
YOLOv9(baseline) | 79.9 | 69.6 |
+Retinexformer | 80.5 | 60.8 |
+CPA-Enhancer | 81.3 | 63.2 |
+CoT-CAFRNet | 81.6 | 62.9 |
Tab. 3 Comparison experiment results of image enhancement networks
Feature pyramid | mAP@0.5/% | Frame rate/(frame·s⁻¹) |
---|---|---|
PANet(baseline) | 79.9 | 69.6 |
+BiFPN | 80.6 | 79.1 |
+HS-FPN | 79.7 | 87.3 |
+BiHS-FPN | 82.0 | 73.4 |
Tab. 4 Comparison experiment results of feature pyramids
Loss function | mAP@0.5/% | Frame rate/(frame·s⁻¹) |
---|---|---|
CIoU(baseline) | 79.9 | 69.6 |
SIoU | 79.3 | 74.5 |
MPDIoU | 80.7 | 68.5 |
IF-MPDIoU | 81.5 | 68.1 |
Tab. 5 Comparison experiment results of loss functions
Adjustment factors | mAP@0.5/% |
---|---|
CIoU(baseline) | 79.9 |
ratio=0.7, d=0, u=0.95 | 80.9 |
ratio=1.0, d=0, u=0.95 | 81.2 |
ratio=1.3, d=0, u=0.95 | 80.8 |
ratio=1.0, d=0, u=0.98 | 81.4 |
ratio=1.0, d=0, u=0.92 | 81.5 |
Tab. 6 Comparison experiment results of loss function adjustment factors
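The factors `ratio`, `d`, and `u` in Tab. 6 correspond to the scale factor of Inner-IoU [27] and the lower and upper thresholds of Focaler-IoU [28]. The sketch below shows one plausible way these pieces could be combined with MPDIoU [26] for a single pair of boxes. The exact composition used in IF-MPDIoU is not given on this page, so the final formula (MPDIoU loss plus IoU minus the Focaler-remapped inner IoU) is an assumption, and all function names are hypothetical.

```python
def iou_xyxy(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def scale_box(box, ratio):
    """Inner-IoU auxiliary box: same center, width and height scaled by ratio."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    hw, hh = (box[2] - box[0]) * ratio / 2, (box[3] - box[1]) * ratio / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)


def focaler(iou, d, u):
    """Focaler-IoU remapping: linear on [d, u], clipped to 0 below and 1 above."""
    if iou < d:
        return 0.0
    if iou > u:
        return 1.0
    return (iou - d) / (u - d)


def if_mpdiou_loss(pred, target, img_w, img_h, ratio=1.0, d=0.0, u=0.92):
    """Assumed composition, NOT confirmed by the source."""
    iou = iou_xyxy(pred, target)
    diag2 = img_w ** 2 + img_h ** 2
    # MPDIoU: penalize squared distances between matching corners,
    # normalized by the squared image diagonal.
    d1 = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    d2 = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    mpdiou = iou - d1 / diag2 - d2 / diag2
    inner = iou_xyxy(scale_box(pred, ratio), scale_box(target, ratio))
    return (1.0 - mpdiou) + iou - focaler(inner, d, u)


print(if_mpdiou_loss((0, 0, 10, 10), (2, 2, 12, 12), 640, 640))
```

The defaults `ratio=1.0, d=0, u=0.92` mirror the best-scoring row of Tab. 6. In this sketch a `ratio` below 1 makes the auxiliary boxes smaller, so their overlap shrinks faster than that of the original boxes as they drift apart.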
Method | Improvements enabled | mAP@0.5/% | Frame rate/(frame·s⁻¹) |
---|---|---|---|
1 | none (baseline) | 79.9 | 69.6 |
2 | CoT-CAFRNet | 81.6 | 62.9 |
3 | BiHS-FPN | 82.0 | 73.4 |
4 | iCAFF | 80.5 | 68.4 |
5 | IF-MPDIoU | 81.5 | 68.1 |
6 | two modules (unspecified) | 82.7 | 66.9 |
7 | two modules (unspecified) | 81.9 | 61.8 |
8 | two modules (unspecified) | 82.4 | 72.5 |
9 | three modules (unspecified) | 83.0 | 65.9 |
10 | CoT-CAFRNet, BiHS-FPN, iCAFF, IF-MPDIoU | 83.8 | 64.8 |
Tab. 7 Ablation experiment results
Algorithm | Frame rate/(frame·s⁻¹) | mAP@0.5/% |
---|---|---|
Faster R-CNN | 18.1 | 63.2 |
SSD | 38.1 | 64.8 |
YOLOv5s | 60.8 | 65.3 |
YOLOv7 | 32.3 | 68.8 |
YOLOv8s | 62.4 | 70.7 |
YOLOv9 | 69.6 | 79.9 |
Dynamic R-CNN | 31.2 | 51.5 |
Cascade R-CNN | 26.8 | 57.0 |
Deformable DETR | 23.3 | 74.9 |
RTMDet | 17.4 | 80.2 |
ITD-YOLOv9 | 64.8 | 83.8 |
Tab. 8 Comparison experiment results
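One way to read Tab. 8 is as a speed-accuracy trade-off: a detector is worth considering only if no other detector is at least as fast and at least as accurate. The sketch below extracts that non-dominated set from the table values. This Pareto-style reading is an illustration added here, not an analysis from the paper, and `pareto_front` is a hypothetical helper.

```python
# (frame rate in frame/s, mAP@0.5 in %) copied from Tab. 8.
RESULTS = {
    "Faster R-CNN": (18.1, 63.2),
    "SSD": (38.1, 64.8),
    "YOLOv5s": (60.8, 65.3),
    "YOLOv7": (32.3, 68.8),
    "YOLOv8s": (62.4, 70.7),
    "YOLOv9": (69.6, 79.9),
    "Dynamic R-CNN": (31.2, 51.5),
    "Cascade R-CNN": (26.8, 57.0),
    "Deformable DETR": (23.3, 74.9),
    "RTMDet": (17.4, 80.2),
    "ITD-YOLOv9": (64.8, 83.8),
}


def pareto_front(results):
    """Return detectors that no other detector beats on both speed and accuracy."""
    front = []
    for name, (fps, m_ap) in results.items():
        dominated = any(
            o_fps >= fps and o_map >= m_ap and (o_fps > fps or o_map > m_ap)
            for other, (o_fps, o_map) in results.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


print(pareto_front(RESULTS))
```

On these numbers only YOLOv9 (fastest) and ITD-YOLOv9 (most accurate) remain, so choosing between them comes down to whether 3.9 percentage points of mAP@0.5 are worth 4.8 frame/s.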
Algorithm | Frame rate/(frame·s⁻¹) | mAP@0.5/% |
---|---|---|
YOLOv9 | 63.7 | 53.6 |
Dynamic R-CNN | 28.6 | 28.4 |
Cascade R-CNN | 25.8 | 31.2 |
Deformable DETR | 20.2 | 39.4 |
RTMDet | 16.1 | 43.2 |
ITD-YOLOv9 | 57.4 | 56.3 |
Tab. 9 Comparison experiment results of generalization
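The accuracy gains quoted in the abstract are differences in percentage points, not relative percentages. The sketch below recomputes them from the YOLOv9 and ITD-YOLOv9 rows of Tab. 8 (self-built dataset) and Tab. 9 (SODA10M), and also reports the relative change in frame rate. The `compare` helper and the dictionary layout are illustrative choices.

```python
# (mAP@0.5 in %, frame rate in frame/s) copied from Tab. 8 and Tab. 9.
RESULTS = {
    "self-built": {"YOLOv9": (79.9, 69.6), "ITD-YOLOv9": (83.8, 64.8)},
    "SODA10M": {"YOLOv9": (53.6, 63.7), "ITD-YOLOv9": (56.3, 57.4)},
}


def compare(dataset):
    """Return (mAP gain in percentage points, frame-rate change in percent)."""
    base_map, base_fps = RESULTS[dataset]["YOLOv9"]
    new_map, new_fps = RESULTS[dataset]["ITD-YOLOv9"]
    map_gain_pp = round(new_map - base_map, 1)
    fps_change_pct = round(100.0 * (new_fps - base_fps) / base_fps, 1)
    return map_gain_pp, fps_change_pct


for dataset in RESULTS:
    gain, fps_change = compare(dataset)
    print(f"{dataset}: mAP@0.5 {gain:+.1f} points, frame rate {fps_change:+.1f}%")
```

The gains come out at 3.9 and 2.7 percentage points, matching the abstract, while the frame rate drops by several percent on both datasets.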
[1] XIAO Y Q, YANG H M. Research on application of object detection algorithm in traffic scene[J]. Computer Engineering and Applications, 2021, 57(6): 30-41. (in Chinese)
[2] VIOLA P, JONES M. Rapid object detection using a boosted cascade of simple features[C]// Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 1. Piscataway: IEEE, 2001: 1-9.
[3] DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 1. Piscataway: IEEE, 2005: 886-893.
[4] LOWE D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91-110.
[5] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587.
[6] GIRSHICK R. Fast R-CNN[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448.
[7] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[8] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788.
[9] LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 21-37.
[10] LIAN J, YIN Y, LI L, et al. Small object detection in traffic scenes based on attention feature fusion[J]. Sensors, 2021, 21(9): No.3031.
[11] WANG Y S, HUA H B, KONG M, et al. Rep-YOLOv8 vehicle and pedestrian detection segmentation algorithm[J]. Modern Electronic Technology, 2024, 47(9): 143-149. (in Chinese)
[12] SHAN H L, LYU Z K, FU X W, et al. Traffic multi-target detection method of YOLOv5s is improved[J]. Foreign Electronic Measurement Technology, 2023, 42(4): 8-15. (in Chinese)
[13] YUAN J, BARMPOUTIS P, STATHAKI T. Multi-scale deformable transformer encoder based single-stage pedestrian detection[C]// Proceedings of the 2022 IEEE International Conference on Image Processing. Piscataway: IEEE, 2022: 2906-2910.
[14] LI N, BAI X, SHEN X, et al. Dense pedestrian detection based on GR-YOLO[J]. Sensors, 2024, 24(14): No.4747.
[15] LIU H, LIU X M, LIU D D. Research on optimization of YOLOv5 detection algorithm for object in complex road[J]. Computer Engineering and Applications, 2023, 59(18): 207-217. (in Chinese)
[16] HUANG L, HUANG W. RD-YOLO: an effective and efficient object detector for roadside perception system[J]. Sensors, 2022, 22(21): No.8097.
[17] KONG X, PENG J Q, ZHANG J, et al. Vehicle object detection method for low-light environment[J]. Journal of Hunan University (Natural Sciences), 2025, 52(1): 187-195. (in Chinese)
[18] YI K, LUO K, CHEN T, et al. An improved YOLOX model and domain transfer strategy for nighttime pedestrian and vehicle detection[J]. Applied Sciences, 2022, 12(23): No.12476.
[19] WANG C Y, YEH I H, LIAO H Y M. YOLOv9: learning what you want to learn using programmable gradient information[C]// Proceedings of the 2024 European Conference on Computer Vision, LNCS 15089. Cham: Springer, 2025: 1-21.
[20] ZHANG Y, WU Y, LIU Y, et al. CPA-Enhancer: chain-of-thought prompted adaptive enhancer for object detection under unknown degradations[C]// Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2025: 1-5.
[21] WANG J, CHEN K, XU R, et al. CARAFE: content-aware reassembly of features[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 3007-3016.
[22] ZHANG X, ZHOU X, LIN M, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6848-6856.
[23] DAI Y, GIESEKE F, OEHMCKE S, et al. Attentional feature fusion[C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2021: 3559-3568.
[24] TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 10778-10787.
[25] CHEN Y, ZHANG C, CHEN B, et al. Accurate leukocyte detection based on deformable-DETR and multi-level feature fusion for aiding diagnosis of blood diseases[J]. Computers in Biology and Medicine, 2024, 170: No.107917.
[26] MA S, XU Y. MPDIoU: a loss for efficient and accurate bounding box regression[EB/OL]. [2024-06-14].
[27] ZHANG H, XU C, ZHANG S. Inner-IoU: more effective intersection over union loss with auxiliary bounding box[EB/OL]. [2024-06-14].
[28] ZHANG H, ZHANG S. Focaler-IoU: more focused intersection over union loss[EB/OL]. [2024-06-19].
[29] TERVEN J, CÓRDOVA-ESPARZA D M, ROMERO-GONZÁLEZ J A. A comprehensive review of YOLO architectures in computer vision: from YOLOv1 to YOLOv8 and YOLO-NAS[J]. Machine Learning and Knowledge Extraction, 2023, 5(4): 1680-1716.
[30] ZHANG H, CHANG H, MA B, et al. Dynamic R-CNN: towards high quality object detection via dynamic training[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12360. Cham: Springer, 2020: 260-275.
[31] CAI Z, VASCONCELOS N. Cascade R-CNN: delving into high quality object detection[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6154-6162.
[32] ZHU X, SU W, LU L, et al. Deformable DETR: deformable transformers for end-to-end object detection[EB/OL]. [2023-03-18].
[33] LYU C, ZHANG W, HUANG H, et al. RTMDet: an empirical study of designing real-time object detectors[EB/OL]. [2023-12-16].