Multi-target detection algorithm for traffic intersection images based on YOLOv9

Journal of Computer Applications, 2025, Vol. 45, Issue (8): 2555-2565. DOI: 10.11772/j.issn.1001-9081.2024071020

Section: Artificial Intelligence

Yanhua LIAO¹,², Yuanxia YAN³, Wenlin PAN⁴

1. School of Electrical and Information Engineering, Yunnan Minzu University, Kunming, Yunnan 650504, China
2. Yunnan Key Laboratory of Unmanned Autonomous Systems (Yunnan Minzu University), Kunming, Yunnan 650504, China
3. Chengdu Xinjin Electric Power Supply Branch Company, State Grid Sichuan Electric Power Company, Xinjin, Sichuan 611430, China
4. School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, Yunnan 650504, China

Received: 2024-07-19; Revised: 2024-11-04; Accepted: 2024-11-04; Online: 2024-11-19; Published: 2025-08-10
Corresponding author: Wenlin PAN
About the authors: LIAO Yanhua, born in 2000 in Yichun, Jiangxi, M.S. candidate. His research interests include image processing.
YAN Yuanxia, born in 1997 in Chengdu, Sichuan, M.S. candidate. Her research interests include image processing.
Supported by: National Natural Science Foundation of China (62362071)

Abstract

To address the complexity of traffic intersection images, the difficulty of detecting small targets, frequent occlusion between targets, and the color distortion, noise, and blurring caused by changes in weather and lighting, a multi-target detection algorithm for traffic intersection images based on YOLOv9 (You Only Look Once version 9), named ITD-YOLOv9 (Intersection Target Detection-YOLOv9), was proposed. Firstly, the image enhancement network CoT-CAFRNet (Chain-of-Thought prompted Content-Aware Feature Reassembly Network) was designed to improve image quality and optimize input features. Secondly, the iterative Channel Adaptive Feature Fusion (iCAFF) module was added to enhance feature extraction for small targets and for overlapping or occluded targets. Thirdly, the feature fusion pyramid structure BiHS-FPN (Bi-directional High-level Screening Feature Pyramid Network) was proposed to strengthen multi-scale feature fusion. Finally, the IF-MPDIoU (Inner-Focaler-Minimum Point Distance based Intersection over Union) loss function was designed to focus on key samples and improve generalization by adjusting its variable factors. Experimental results show that, on the self-made dataset and the SODA10M dataset, ITD-YOLOv9 achieves detection accuracies of 83.8% and 56.3% at detection speeds of 64.8 frame/s and 57.4 frame/s, respectively; compared with YOLOv9, the detection accuracies are improved by 3.9 and 2.7 percentage points, respectively. These results indicate that the proposed algorithm is effective for multi-target detection at traffic intersections.

Key words: YOLOv9 (You Only Look Once version 9), traffic intersection detection, adaptive fusion, multi-target detection, deep learning

CLC number: TP391.41

Cite this article: Yanhua LIAO, Yuanxia YAN, Wenlin PAN. Multi-target detection algorithm for traffic intersection images based on YOLOv9[J]. Journal of Computer Applications, 2025, 45(8): 2555-2565.

Figures and tables (20)

Fig. 1 Network structure of ITD-YOLOv9

Fig. 2 Structure of CoT-CAFRNet

Fig. 3 Module structure of CPB

Fig. 4 Module structure of CARAFE

Fig. 5 Module structure of iCAFF

Fig. 6 Schematic diagram of channel shuffle mechanism

Fig. 7 Module structure of iAFF

Fig. 8 Structure of BiHS-FPN

Fig. 9 Module structure of CA and SSF

Tab. 1 Training hyperparameters

Parameter	Setting	Parameter	Setting
Learning Rate	0.01	Batch Size	8
Image Size	640 × 640	Epoch	100
Momentum	0.937	Weight Decay	0.0005
Optimizer	SGD
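The hyperparameters in Tab. 1 can be collected into a small configuration object. The sketch below is illustrative only: the class name `TrainConfig` and the helper `sgd_momentum_step` are hypothetical, and the update rule (L2 weight decay added to the gradient, then a momentum buffer, as in common deep learning frameworks) is an assumption, since the page lists only the values and not the exact optimizer formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values copied from Tab. 1; the field names are hypothetical."""
    learning_rate: float = 0.01
    batch_size: int = 8
    image_size: tuple = (640, 640)
    epochs: int = 100
    momentum: float = 0.937
    weight_decay: float = 0.0005
    optimizer: str = "SGD"


def sgd_momentum_step(param, grad, velocity, cfg):
    """One scalar SGD step with momentum and L2 weight decay (assumed form)."""
    g = grad + cfg.weight_decay * param          # weight decay folded into the gradient
    velocity = cfg.momentum * velocity + g       # momentum buffer update
    param = param - cfg.learning_rate * velocity  # parameter update
    return param, velocity


cfg = TrainConfig()
p, v = sgd_momentum_step(param=1.0, grad=0.5, velocity=0.0, cfg=cfg)
print(p, v)
```

With a zero initial velocity, the first step reduces to plain gradient descent on the decayed gradient; the momentum term only takes effect from the second step onward.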

Tab. 2 Comparison of target detection accuracy between ITD-YOLOv9 and YOLOv9 algorithms

Algorithm	mAP@0.5/%	Precision/%	Recall/%	Frame rate/(frame/s)	AP@0.5/% (Class 0)	AP@0.5/% (Class 1)	AP@0.5/% (Class 2)	AP@0.5/% (Class 3)	AP@0.5/% (Class 4)	AP@0.5/% (Class 5)	AP@0.5/% (Class 6)	AP@0.5/% (Class 7)	AP@0.5/% (Class 8)
YOLOv9	79.9	85.1	74.8	69.6	80.4	87.2	76.2	89.3	68.4	82.9	66.8	80.8	86.9
ITD-YOLOv9	83.8	88.2	76.6	64.8	83.8	95.5	79.2	94.2	72.8	88.8	69.5	82.7	87.4
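As a consistency check on Tab. 2, the sketch below recomputes mAP@0.5 as the unweighted mean of the nine per-class AP@0.5 values. The page does not state how mAP was aggregated, so the unweighted mean is an assumption (it is the usual definition); the variable names are illustrative.

```python
# Per-class AP@0.5 values (Class 0 to Class 8) copied from Tab. 2.
ap_per_class = {
    "YOLOv9": [80.4, 87.2, 76.2, 89.3, 68.4, 82.9, 66.8, 80.8, 86.9],
    "ITD-YOLOv9": [83.8, 95.5, 79.2, 94.2, 72.8, 88.8, 69.5, 82.7, 87.4],
}


def mean_ap(aps):
    """Unweighted mean over classes (assumed aggregation)."""
    return sum(aps) / len(aps)


map_by_model = {name: round(mean_ap(aps), 1) for name, aps in ap_per_class.items()}
gain = round(map_by_model["ITD-YOLOv9"] - map_by_model["YOLOv9"], 1)

# Per-class improvement in percentage points.
per_class_gain = [
    round(new - old, 1)
    for old, new in zip(ap_per_class["YOLOv9"], ap_per_class["ITD-YOLOv9"])
]
best_class = max(range(len(per_class_gain)), key=per_class_gain.__getitem__)

print(map_by_model, gain, per_class_gain, best_class)
```

Under this assumption the recomputed means round to the mAP@0.5 column of the table, and the per-class differences show which categories account for most of the gain.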

Fig. 10 Comparison of experimental results of different algorithms

Tab. 3 Comparison experiment results of image enhancement networks

Image enhancement network	mAP@0.5/%	Frame rate/(frame/s)
YOLOv9 (baseline)	79.9	69.6
+Retinexformer	80.5	60.8
+CPA-Enhancer	81.3	63.2
+CoT-CAFRNet	81.6	62.9

Tab. 4 Comparison experiment results of feature pyramids

Feature pyramid	mAP@0.5/%	Frame rate/(frame/s)
PANet (baseline)	79.9	69.6
+BiFPN	80.6	79.1
+HS-FPN	79.7	87.3
+BiHS-FPN	82.0	73.4

Tab. 5 Comparison experiment results of loss functions

Loss function	mAP@0.5/%	Frame rate/(frame/s)
CIoU (baseline)	79.9	69.6
SIoU	79.3	74.5
MPDIoU	80.7	68.5
IF-MPDIoU	81.5	68.1

Tab. 6 Comparison experiment results of loss function adjustment factors

Adjustment factors	mAP@0.5/%
CIoU (baseline)	79.9
ratio=0.7, d=0, u=0.95	80.9
ratio=1.0, d=0, u=0.95	81.2
ratio=1.0, d=0, u=0.98	81.4
ratio=1.3, d=0, u=0.95	80.8
ratio=1.0, d=0, u=0.92	81.5
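To make the roles of the three adjustment factors concrete, the sketch below combines the published building blocks that give IF-MPDIoU its name: the MPDIoU corner-distance penalty [26], the Inner-IoU auxiliary boxes scaled by `ratio` [27], and the Focaler-IoU linear remapping over the interval [d, u] [28]. This page does not give the authors' exact formula, so the way the three parts are combined here is an assumption, and all function names are hypothetical. Boxes are (x1, y1, x2, y2) in pixels.

```python
def iou_xyxy(a, b):
    """Plain IoU of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def scale_box(box, ratio):
    """Inner-IoU auxiliary box: same center, width and height scaled by ratio."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    hw, hh = (box[2] - box[0]) * ratio / 2, (box[3] - box[1]) * ratio / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)


def focaler(iou, d, u):
    """Focaler-IoU remapping: 0 below d, 1 above u, linear in between."""
    if iou < d:
        return 0.0
    if iou > u:
        return 1.0
    return (iou - d) / (u - d)


def if_mpdiou_loss(pred, gt, img_w, img_h, ratio=1.0, d=0.0, u=0.92):
    """Assumed combination: L_MPDIoU + IoU - Focaler(Inner-IoU)."""
    iou = iou_xyxy(pred, gt)
    diag2 = img_w ** 2 + img_h ** 2
    d1 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2  # top-left corner distance squared
    d2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2  # bottom-right corner distance squared
    mpdiou = iou - d1 / diag2 - d2 / diag2
    inner = iou_xyxy(scale_box(pred, ratio), scale_box(gt, ratio))
    return (1.0 - mpdiou) + iou - focaler(inner, d, u)


gt_box = (100.0, 100.0, 200.0, 200.0)
print(if_mpdiou_loss(gt_box, gt_box, 640, 640))
print(if_mpdiou_loss((120.0, 110.0, 220.0, 210.0), gt_box, 640, 640))
```

The defaults follow the best-scoring row of Tab. 6 (ratio=1.0, d=0, u=0.92). In this sketch, lowering `u` makes well-overlapped boxes saturate at zero IoU loss sooner, while `ratio` below or above 1 shrinks or enlarges the auxiliary boxes used for the overlap term.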

Tab. 7 Ablation experiment results

Variant	CoT-CAFRNet	BiHS-FPN	iCAFF	IF-MPDIoU	mAP@0.5/%	Frame rate/(frame/s)
1					79.9	69.6
2	√				81.6	62.9
3		√			82.0	73.4
4			√		80.5	68.4
5				√	81.5	68.1
6	√	√			82.7	66.9
7	√		√		81.9	61.8
8		√	√		82.4	72.5
9	√	√	√		83.0	65.9
10	√	√	√	√	83.8	64.8
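One way to read Tab. 7 is to compare the gain of each module used alone (variants 2 to 5) with the gain of the full model (variant 10). The short sketch below does this arithmetic with values copied from the table; the variable names are illustrative.

```python
BASELINE_MAP = 79.9  # variant 1
FULL_MAP = 83.8      # variant 10, all four modules enabled

# mAP@0.5 when a single module is added to the baseline (variants 2 to 5).
single_module_map = {
    "CoT-CAFRNet": 81.6,
    "BiHS-FPN": 82.0,
    "iCAFF": 80.5,
    "IF-MPDIoU": 81.5,
}

single_gains = {
    name: round(value - BASELINE_MAP, 1) for name, value in single_module_map.items()
}
sum_of_single_gains = round(sum(single_gains.values()), 1)
full_gain = round(FULL_MAP - BASELINE_MAP, 1)

print(single_gains, sum_of_single_gains, full_gain)
```

The combined gain is smaller than the sum of the individual gains, which suggests the modules partly overlap in the errors they correct rather than contributing independently.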

Tab. 8 Comparison experiment results

Algorithm	Frame rate/(frame/s)	mAP@0.5/%
Faster R-CNN [7]	18.1	63.2
SSD [9]	38.1	64.8
YOLOv5s [29]	60.8	65.3
YOLOv7 [29]	32.3	68.8
YOLOv8s [29]	62.4	70.7
YOLOv9 [19]	69.6	79.9
Dynamic R-CNN [30]	31.2	51.5
Cascade R-CNN [31]	26.8	57.0
Deformable DETR [32]	23.3	74.9
RTMDet [33]	17.4	80.2
ITD-YOLOv9	64.8	83.8
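The speed and accuracy trade-off in Tab. 8 can be summarized by finding the algorithms that no other entry beats on both frame rate and mAP@0.5 at once. The sketch below is an illustrative analysis of the tabulated numbers, not part of the paper's method; the function name `pareto_front` is hypothetical.

```python
# (frame rate in frame/s, mAP@0.5 in %) copied from Tab. 8.
results = {
    "Faster R-CNN": (18.1, 63.2),
    "SSD": (38.1, 64.8),
    "YOLOv5s": (60.8, 65.3),
    "YOLOv7": (32.3, 68.8),
    "YOLOv8s": (62.4, 70.7),
    "YOLOv9": (69.6, 79.9),
    "Dynamic R-CNN": (31.2, 51.5),
    "Cascade R-CNN": (26.8, 57.0),
    "Deformable DETR": (23.3, 74.9),
    "RTMDet": (17.4, 80.2),
    "ITD-YOLOv9": (64.8, 83.8),
}


def pareto_front(points):
    """Return names not dominated by any other entry (higher is better on both axes)."""
    front = []
    for name, (fps, acc) in points.items():
        dominated = any(
            o_fps >= fps and o_acc >= acc and (o_fps > fps or o_acc > acc)
            for other, (o_fps, o_acc) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


print(pareto_front(results))
```

On these numbers only YOLOv9 (fastest) and ITD-YOLOv9 (most accurate) remain undominated: every other entry is both slower and less accurate than ITD-YOLOv9.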

Fig. 11 Curves of different algorithms in mAP@0.5

Tab. 9 Comparison experiment results of generalization

Algorithm	Frame rate/(frame/s)	mAP@0.5/%
YOLOv9	63.7	53.6
Dynamic R-CNN	28.6	28.4
Cascade R-CNN	25.8	31.2
Deformable DETR	20.2	39.4
RTMDet	16.1	43.2
ITD-YOLOv9	57.4	56.3

References (33)

[1]	XIAO Y Q, YANG H M. Research on application of object detection algorithm in traffic scene[J]. Computer Engineering and Applications, 2021, 57(6): 30-41. (in Chinese)
[2]	VIOLA P, JONES M. Rapid object detection using a boosted cascade of simple features[C]// Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 1. Piscataway: IEEE, 2001: 1-9.
[3]	DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 1. Piscataway: IEEE, 2005: 886-893.
[4]	LOWE D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91-110.
[5]	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587.
[6]	GIRSHICK R. Fast R-CNN[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448.
[7]	REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[8]	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788.
[9]	LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 21-37.
[10]	LIAN J, YIN Y, LI L, et al. Small object detection in traffic scenes based on attention feature fusion[J]. Sensors, 2021, 21(9): No. 3031.
[11]	WANG Y S, HUA H B, KONG M, et al. Rep-YOLOv8 vehicle and pedestrian detection segmentation algorithm[J]. Modern Electronic Technology, 2024, 47(9): 143-149. (in Chinese)
[12]	SHAN H L, LYU Z K, FU X W, et al. Traffic multi-target detection method of YOLOv5s is improved[J]. Foreign Electronic Measurement Technology, 2023, 42(4): 8-15. (in Chinese)
[13]	YUAN J, BARMPOUTIS P, STATHAKI T. Multi-scale deformable transformer encoder based single-stage pedestrian detection[C]// Proceedings of the 2022 IEEE International Conference on Image Processing. Piscataway: IEEE, 2022: 2906-2910.
[14]	LI N, BAI X, SHEN X, et al. Dense pedestrian detection based on GR-YOLO[J]. Sensors, 2024, 24(14): No. 4747.
[15]	LIU H, LIU X M, LIU D D. Research on optimization of YOLOv5 detection algorithm for object in complex road[J]. Computer Engineering and Applications, 2023, 59(18): 207-217. (in Chinese)
[16]	HUANG L, HUANG W. RD-YOLO: an effective and efficient object detector for roadside perception system[J]. Sensors, 2022, 22(21): No. 8097.
[17]	KONG X, PENG J Q, ZHANG J, et al. Vehicle object detection method for low-light environment[J]. Journal of Hunan University (Natural Sciences), 2025, 52(1): 187-195. (in Chinese)
[18]	YI K, LUO K, CHEN T, et al. An improved YOLOX model and domain transfer strategy for nighttime pedestrian and vehicle detection[J]. Applied Sciences, 2022, 12(23): No. 12476.
[19]	WANG C Y, YEH I H, LIAO H Y M. YOLOv9: learning what you want to learn using programmable gradient information[C]// Proceedings of the 2024 European Conference on Computer Vision, LNCS 15089. Cham: Springer, 2025: 1-21.
[20]	ZHANG Y, WU Y, LIU Y, et al. CPA-Enhancer: chain-of-thought prompted adaptive enhancer for object detection under unknown degradations[C]// Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2025: 1-5.
[21]	WANG J, CHEN K, XU R, et al. CARAFE: content-aware reassembly of features[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 3007-3016.
[22]	ZHANG X, ZHOU X, LIN M, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6848-6856.
[23]	DAI Y, GIESEKE F, OEHMCKE S, et al. Attentional feature fusion[C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2021: 3559-3568.
[24]	TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 10778-10787.
[25]	CHEN Y, ZHANG C, CHEN B, et al. Accurate leukocyte detection based on deformable-DETR and multi-level feature fusion for aiding diagnosis of blood diseases[J]. Computers in Biology and Medicine, 2024, 170: No. 107917.
[26]	MA S, XU Y. MPDIoU: a loss for efficient and accurate bounding box regression[EB/OL]. [2024-06-14].
[27]	ZHANG H, XU C, ZHANG S. Inner-IoU: more effective intersection over union loss with auxiliary bounding box[EB/OL]. [2024-06-14].
[28]	ZHANG H, ZHANG S. Focaler-IoU: more focused intersection over union loss[EB/OL]. [2024-06-19].
[29]	TERVEN J, CÓRDOVA-ESPARZA D M, ROMERO-GONZÁLEZ J A. A comprehensive review of YOLO architectures in computer vision: from YOLOv1 to YOLOv8 and YOLO-NAS[J]. Machine Learning and Knowledge Extraction, 2023, 5(4): 1680-1716.
[30]	ZHANG H, CHANG H, MA B, et al. Dynamic R-CNN: towards high quality object detection via dynamic training[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12360. Cham: Springer, 2020: 260-275.
[31]	CAI Z, VASCONCELOS N. Cascade R-CNN: delving into high quality object detection[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6154-6162.
[32]	ZHU X, SU W, LU L, et al. Deformable DETR: deformable transformers for end-to-end object detection[EB/OL]. [2023-03-18].
[33]	LYU C, ZHANG W, HUANG H, et al. RTMDet: an empirical study of designing real-time object detectors[EB/OL]. [2023-12-16].
