Multi-target detection algorithm for traffic intersection images based on YOLOv9

Journal of Computer Applications, 2025, Vol. 45, Issue (8): 2555-2565. DOI: 10.11772/j.issn.1001-9081.2024071020

• Artificial intelligence •

Yanhua LIAO¹,², Yuanxia YAN³, Wenlin PAN⁴

1. School of Electrical and Information Engineering, Yunnan Minzu University, Kunming, Yunnan 650504, China
2. Yunnan Key Laboratory of Unmanned Autonomous Systems (Yunnan Minzu University), Kunming, Yunnan 650504, China
3. Chengdu Xinjin Electric Power Supply Branch Company, State Grid Sichuan Electric Power Company, Xinjin, Sichuan 611430, China
4. School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, Yunnan 650504, China

Received: 2024-07-19 Revised: 2024-11-04 Accepted: 2024-11-04 Online: 2024-11-19 Published: 2025-08-10
Contact: Wenlin PAN
About author: LIAO Yanhua, born in 2000 in Yichun, Jiangxi, M. S. candidate. His research interests include image processing.
YAN Yuanxia, born in 1997 in Chengdu, Sichuan, M. S. candidate. Her research interests include image processing.
Supported by: National Natural Science Foundation of China (62362071)

Abstract

To address the problems of traffic intersection images, namely complex scenes, small targets that are hard to detect, frequent occlusion between targets, and the color distortion, noise, and blurring caused by changes in weather and lighting, a multi-target detection algorithm based on YOLOv9 (You Only Look Once version 9), named ITD-YOLOv9 (Intersection Target Detection-YOLOv9), was proposed. Firstly, an image enhancement network, CoT-CAFRNet (Chain-of-Thought prompted Content-Aware Feature Reassembly Network), was designed to improve image quality and optimize the input features. Secondly, the iterative Channel Adaptive Feature Fusion (iCAFF) module was added to strengthen feature extraction for small targets and for overlapping or occluded targets. Thirdly, a feature fusion pyramid structure, BiHS-FPN (Bi-directional High-level Screening Feature Pyramid Network), was proposed to strengthen multi-scale feature fusion. Finally, the IF-MPDIoU (Inner-Focaler-Minimum Point Distance based Intersection over Union) loss function was designed, which focuses on key samples and improves generalization by adjusting variable factors. Experimental results show that, on the self-made dataset and the SODA10M dataset, ITD-YOLOv9 achieves detection accuracies (mAP@0.5) of 83.8% and 56.3% at detection speeds of 64.8 frame/s and 57.4 frame/s, respectively; compared with YOLOv9, the detection accuracy is improved by 3.9 and 2.7 percentage points, respectively. These results indicate that the proposed algorithm detects multiple targets at traffic intersections effectively.

Key words: YOLOv9 (You Only Look Once version 9), traffic intersection detection, adaptive fusion, multi-target detection, deep learning

CLC Number: TP391.41

Yanhua LIAO, Yuanxia YAN, Wenlin PAN. Multi-target detection algorithm for traffic intersection images based on YOLOv9[J]. Journal of Computer Applications, 2025, 45(8): 2555-2565.

Figures/Tables 20

Fig. 1 Network structure of ITD-YOLOv9

Fig. 2 Structure of CoT-CAFRNet

Fig. 3 Module structure of CPB

Fig. 4 Module structure of CARAFE

Fig. 5 Module structure of iCAFF

Fig. 6 Schematic diagram of channel shuffle mechanism

Fig. 7 Module structure of iAFF

Fig. 8 Structure of BiHS-FPN

Fig. 9 Module structure of CA and SSF

Tab. 1 Training hyperparameters

Parameter	Setting	Parameter	Setting
Learning Rate	0.01	Batch Size	8
Image Size	640 × 640	Epoch	100
Momentum	0.937	Weight Decay	0.0005
Optimizer	SGD
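
To make the roles of the learning rate, momentum, and weight decay in Tab. 1 concrete, the following is a minimal sketch of one common formulation of SGD with momentum and L2 weight decay (the convention used by PyTorch-style optimizers). This record does not state which framework or exact update rule the authors used, so the class `SGDState`, its `step` method, and the scalar parameter and gradient values are hypothetical and purely illustrative.

```python
from dataclasses import dataclass


@dataclass
class SGDState:
    """Scalar SGD with momentum; defaults are the values listed in Tab. 1."""
    lr: float = 0.01              # Learning Rate
    momentum: float = 0.937       # Momentum
    weight_decay: float = 0.0005  # Weight Decay
    velocity: float = 0.0         # momentum buffer, starts at zero

    def step(self, param: float, grad: float) -> float:
        # L2 weight decay is folded into the gradient (assumed convention).
        grad = grad + self.weight_decay * param
        # The momentum buffer accumulates past gradients.
        self.velocity = self.momentum * self.velocity + grad
        # Move against the accumulated direction, scaled by the learning rate.
        return param - self.lr * self.velocity


opt = SGDState()
p1 = opt.step(param=1.0, grad=0.5)  # illustrative parameter and gradient
p2 = opt.step(param=p1, grad=0.5)   # second step is larger because of momentum
print(p1, p2)
```

With a constant gradient, the second step moves the parameter almost twice as far as the first, which is the effect of a momentum value close to 1.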

Tab. 2 Comparison of target detection accuracy between ITD-YOLOv9 and YOLOv9 algorithms

Algorithm	mAP@0.5/%	Precision/%	Recall/%	Frame rate/(frame·s^-1)	AP@0.5/% (class 0)	AP@0.5/% (class 1)	AP@0.5/% (class 2)	AP@0.5/% (class 3)	AP@0.5/% (class 4)	AP@0.5/% (class 5)	AP@0.5/% (class 6)	AP@0.5/% (class 7)	AP@0.5/% (class 8)
YOLOv9	79.9	85.1	74.8	69.6	80.4	87.2	76.2	89.3	68.4	82.9	66.8	80.8	86.9
ITD-YOLOv9	83.8	88.2	76.6	64.8	83.8	95.5	79.2	94.2	72.8	88.8	69.5	82.7	87.4
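
The figures in Tab. 2 can be cross-checked with a few lines of arithmetic. The sketch below assumes that mAP@0.5 is the unweighted mean of the nine per-class AP@0.5 values, which is the usual definition; the variable names are illustrative and the numbers are copied from the table.

```python
# Per-class AP@0.5 (%) for classes 0 to 8, copied from Tab. 2.
yolov9_ap = [80.4, 87.2, 76.2, 89.3, 68.4, 82.9, 66.8, 80.8, 86.9]
itd_ap = [83.8, 95.5, 79.2, 94.2, 72.8, 88.8, 69.5, 82.7, 87.4]


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


# mAP@0.5 as the mean of the per-class AP values, rounded as in the table.
map_yolov9 = round(mean(yolov9_ap), 1)
map_itd = round(mean(itd_ap), 1)

# Overall and per-class gains in percentage points.
map_gain = round(map_itd - map_yolov9, 1)
class_gains = [round(new - old, 1) for old, new in zip(yolov9_ap, itd_ap)]
best_class = class_gains.index(max(class_gains))

# Frame-rate cost of the added modules (frame/s), also from Tab. 2.
fps_drop = round(69.6 - 64.8, 1)

print(map_yolov9, map_itd, map_gain)
print(class_gains, best_class, fps_drop)
```

Under that assumption the per-class values reproduce the reported mAP@0.5 of both algorithms and the 3.9 percentage point gain, with every class improving and the largest gain on class 1.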

Fig. 10 Comparison of experimental results of different algorithms

Tab. 3 Comparison experiment results of image enhancement networks

Image enhancement network	mAP@0.5/%	Frame rate/(frame·s^-1)
YOLOv9 (baseline)	79.9	69.6
+Retinexformer	80.5	60.8
+CPA-Enhancer	81.3	63.2
+CoT-CAFRNet	81.6	62.9

Tab. 4 Comparison experiment results of feature pyramids

Feature pyramid	mAP@0.5/%	Frame rate/(frame·s^-1)
PANet (baseline)	79.9	69.6
+BiFPN	80.6	79.1
+HS-FPN	79.7	87.3
+BiHS-FPN	82.0	73.4

Tab. 5 Comparison experiment results of loss functions

Loss function	mAP@0.5/%	Frame rate/(frame·s^-1)
CIoU (baseline)	79.9	69.6
SIoU	79.3	74.5
MPDIoU	80.7	68.5
IF-MPDIoU	81.5	68.1

Tab. 6 Comparison experiment results of loss function adjustment factors

Adjustment factors	mAP@0.5/%
CIoU (baseline)	79.9
ratio=0.7, d=0, u=0.95	80.9
ratio=1.0, d=0, u=0.95	81.2
ratio=1.3, d=0, u=0.95	80.8
ratio=1.0, d=0, u=0.98	81.4
ratio=1.0, d=0, u=0.92	81.5
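
The factors ratio, d, and u in Tab. 6 correspond to the published definitions of Inner-IoU [27] (auxiliary boxes scaled about their centers by ratio) and Focaler-IoU [28] (a linear remapping of IoU over the interval [d, u]), while MPDIoU [26] adds corner-distance penalties normalized by the image size. This record does not give the exact formula of IF-MPDIoU, so the following is only a sketch of one plausible way to combine the three ideas. The function names, the order of composition, and the example boxes are assumptions; the defaults are taken from the best row of Tab. 6.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), top-left and bottom-right


def iou(a: Box, b: Box) -> float:
    """Plain intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def scale_box(box: Box, ratio: float) -> Box:
    """Inner-IoU auxiliary box: same center, width and height scaled by ratio."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    hw, hh = (box[2] - box[0]) * ratio / 2, (box[3] - box[1]) * ratio / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)


def focaler(value: float, d: float, u: float) -> float:
    """Focaler-IoU remapping: 0 below d, 1 above u, linear in between."""
    if value < d:
        return 0.0
    if value > u:
        return 1.0
    return (value - d) / (u - d)


def if_mpdiou_loss(pred: Box, target: Box, img_w: float, img_h: float,
                   ratio: float = 1.0, d: float = 0.0, u: float = 0.92) -> float:
    """Assumed composition: 1 - focaler(inner IoU) + MPDIoU corner penalties."""
    inner = iou(scale_box(pred, ratio), scale_box(target, ratio))
    norm = img_w ** 2 + img_h ** 2
    # Squared distances between top-left corners and between bottom-right corners.
    d1 = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    d2 = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    return 1.0 - focaler(inner, d, u) + d1 / norm + d2 / norm


perfect = if_mpdiou_loss((10, 10, 50, 50), (10, 10, 50, 50), 640, 640)
disjoint = if_mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 640, 640)
print(perfect, disjoint)
```

In this sketch a perfectly matched box gives zero loss, and non-overlapping boxes still receive a distance-dependent penalty above 1, so the gradient does not vanish when the IoU is zero.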

Tab. 7 Ablation experiment results

Method	CoT-CAFRNet	BiHS-FPN	iCAFF	IF-MPDIoU	mAP@0.5/%	Frame rate/(frame·s^-1)
1					79.9	69.6
2	√				81.6	62.9
3		√			82.0	73.4
4			√		80.5	68.4
5				√	81.5	68.1
6	√	√			82.7	66.9
7	√		√		81.9	61.8
8		√	√		82.4	72.5
9	√	√	√		83.0	65.9
10	√	√	√	√	83.8	64.8

Tab. 8 Comparison experiment results

Algorithm	Frame rate/(frame·s^-1)	mAP@0.5/%
Faster R-CNN [7]	18.1	63.2
SSD [9]	38.1	64.8
YOLOv5s [29]	60.8	65.3
YOLOv7 [29]	32.3	68.8
YOLOv8s [29]	62.4	70.7
YOLOv9 [19]	69.6	79.9
Dynamic R-CNN [30]	31.2	51.5
Cascade R-CNN [31]	26.8	57.0
Deformable DETR [32]	23.3	74.9
RTMDet [33]	17.4	80.2
ITD-YOLOv9	64.8	83.8

Fig. 11 Curves of different algorithms in mAP@0.5

Tab. 9 Comparison experiment results of generalization

Algorithm	Frame rate/(frame·s^-1)	mAP@0.5/%
YOLOv9	63.7	53.6
Dynamic R-CNN	28.6	28.4
Cascade R-CNN	25.8	31.2
Deformable DETR	20.2	39.4
RTMDet	16.1	43.2
ITD-YOLOv9	57.4	56.3

References 33

[1]	XIAO Y Q, YANG H M. Research on application of object detection algorithm in traffic scene[J]. Computer Engineering and Applications, 2021, 57(6): 30-41.
[2]	VIOLA P, JONES M. Rapid object detection using a boosted cascade of simple features[C]// Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 1. Piscataway: IEEE, 2001: 1-9.
[3]	DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 1. Piscataway: IEEE, 2005: 886-893.
[4]	LOWE D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91-110.
[5]	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587.
[6]	GIRSHICK R. Fast R-CNN[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448.
[7]	REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[8]	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788.
[9]	LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 21-37.
[10]	LIAN J, YIN Y, LI L, et al. Small object detection in traffic scenes based on attention feature fusion[J]. Sensors, 2021, 21(9): No.3031.
[11]	WANG Y S, HUA H B, KONG M, et al. Rep-YOLOv8 vehicle and pedestrian detection segmentation algorithm[J]. Modern Electronic Technology, 2024, 47(9): 143-149.
[12]	SHAN H L, LYU Z K, FU X W, et al. Traffic multi-target detection method of YOLOv5s is improved[J]. Foreign Electronic Measurement Technology, 2023, 42(4): 8-15.
[13]	YUAN J, BARMPOUTIS P, STATHAKI T. Multi-scale deformable transformer encoder based single-stage pedestrian detection[C]// Proceedings of the 2022 IEEE International Conference on Image Processing. Piscataway: IEEE, 2022: 2906-2910.
[14]	LI N, BAI X, SHEN X, et al. Dense pedestrian detection based on GR-YOLO[J]. Sensors, 2024, 24(14): No.4747.
[15]	LIU H, LIU X M, LIU D D. Research on optimization of YOLOv5 detection algorithm for object in complex road[J]. Computer Engineering and Applications, 2023, 59(18): 207-217.
[16]	HUANG L, HUANG W. RD-YOLO: an effective and efficient object detector for roadside perception system[J]. Sensors, 2022, 22(21): No.8097.
[17]	KONG X, PENG J Q, ZHANG J, et al. Vehicle object detection method for low-light environment[J]. Journal of Hunan University (Natural Sciences), 2025, 52(1): 187-195.
[18]	YI K, LUO K, CHEN T, et al. An improved YOLOX model and domain transfer strategy for nighttime pedestrian and vehicle detection[J]. Applied Sciences, 2022, 12(23): No.12476.
[19]	WANG C Y, YEH I H, LIAO H Y M. YOLOv9: learning what you want to learn using programmable gradient information[C]// Proceedings of the 2024 European Conference on Computer Vision, LNCS 15089. Cham: Springer, 2025: 1-21.
[20]	ZHANG Y, WU Y, LIU Y, et al. CPA-Enhancer: chain-of-thought prompted adaptive enhancer for object detection under unknown degradations[C]// Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2025: 1-5.
[21]	WANG J, CHEN K, XU R, et al. CARAFE: content-aware reassembly of features[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 3007-3016.
[22]	ZHANG X, ZHOU X, LIN M, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6848-6856.
[23]	DAI Y, GIESEKE F, OEHMCKE S, et al. Attentional feature fusion[C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2021: 3559-3568.
[24]	TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 10778-10787.
[25]	CHEN Y, ZHANG C, CHEN B, et al. Accurate leukocyte detection based on deformable-DETR and multi-level feature fusion for aiding diagnosis of blood diseases[J]. Computers in Biology and Medicine, 2024, 170: No.107917.
[26]	MA S, XU Y. MPDIoU: a loss for efficient and accurate bounding box regression[EB/OL]. [2024-06-14].
[27]	ZHANG H, XU C, ZHANG S. Inner-IoU: more effective intersection over union loss with auxiliary bounding box[EB/OL]. [2024-06-14].
[28]	ZHANG H, ZHANG S. Focaler-IoU: more focused intersection over union loss[EB/OL]. [2024-06-19].
[29]	TERVEN J, CÓRDOVA-ESPARZA D M, ROMERO-GONZÁLEZ J A. A comprehensive review of YOLO architectures in computer vision: from YOLOv1 to YOLOv8 and YOLO-NAS[J]. Machine Learning and Knowledge Extraction, 2023, 5(4): 1680-1716.
[30]	ZHANG H, CHANG H, MA B, et al. Dynamic R-CNN: towards high quality object detection via dynamic training[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12360. Cham: Springer, 2020: 260-275.
[31]	CAI Z, VASCONCELOS N. Cascade R-CNN: delving into high quality object detection[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6154-6162.
[32]	ZHU X, SU W, LU L, et al. Deformable DETR: deformable transformers for end-to-end object detection[EB/OL]. [2023-03-18].
[33]	LYU C, ZHANG W, HUANG H, et al. RTMDet: an empirical study of designing real-time object detectors[EB/OL]. [2023-12-16].

Related Articles 15

[1]	Lina GE, Mingyu WANG, Lei TIAN. Review of research on efficiency of federated learning [J]. Journal of Computer Applications, 2025, 45(8): 2387-2398.
[2]	Peng PENG, Ziting CAI, Wenling LIU, Caihua CHEN, Wei ZENG, Baolai HUANG. Speech emotion recognition method based on hybrid Siamese network with CNN and bidirectional GRU [J]. Journal of Computer Applications, 2025, 45(8): 2515-2521.
[3]	Shuo ZHANG, Guokai SUN, Yuan ZHUANG, Xiaoyu FENG, Jingzhi WANG. Dynamic detection method of eclipse attacks for blockchain node analysis [J]. Journal of Computer Applications, 2025, 45(8): 2428-2436.
[4]	Jinxian SUO, Liping ZHANG, Sheng YAN, Dongqi WANG, Yawen ZHANG. Review of interpretable deep knowledge tracing methods [J]. Journal of Computer Applications, 2025, 45(7): 2043-2055.
[5]	Zhenzhou WANG, Fangfang GUO, Jingfang SU, He SU, Jianchao WANG. Robustness optimization method of visual model for intelligent inspection [J]. Journal of Computer Applications, 2025, 45(7): 2361-2368.
[6]	Qiaoling QI, Xiaoxiao WANG, Qianqian ZHANG, Peng WANG, Yongfeng DONG. Label noise adaptive learning algorithm based on meta-learning [J]. Journal of Computer Applications, 2025, 45(7): 2113-2122.
[7]	Xiaoyang ZHAO, Xinzheng XU, Zhongnian LI. Research review on explainable artificial intelligence in internet of things applications [J]. Journal of Computer Applications, 2025, 45(7): 2169-2179.
[8]	Tianchen HUA, Xiaoning MA, Hui ZHI. Portable executable malware static detection model based on shallow artificial neural network [J]. Journal of Computer Applications, 2025, 45(6): 1911-1921.
[9]	Lanhao LI, Haojun YAN, Haoyi ZHOU, Qingyun SUN, Jianxin LI. Multi-scale information fusion time series long-term forecasting model based on neural network [J]. Journal of Computer Applications, 2025, 45(6): 1776-1783.
[10]	Dan WANG, Wenhao ZHANG, Lijuan PENG. Channel estimation of reconfigurable intelligent surface assisted communication system based on deep learning [J]. Journal of Computer Applications, 2025, 45(5): 1613-1618.
[11]	Sijie NIU, Yuliang LIU. Auxiliary diagnostic method for retinopathy based on dual-branch structure with knowledge distillation [J]. Journal of Computer Applications, 2025, 45(5): 1410-1414.
[12]	Kai CHEN, Hailiang YE, Feilong CAO. Classification algorithm for point cloud based on local-global interaction and structural Transformer [J]. Journal of Computer Applications, 2025, 45(5): 1671-1676.
[13]	Wenpeng WANG, Yinchang QIN, Wenxuan SHI. Review of unsupervised deep learning methods for industrial defect detection [J]. Journal of Computer Applications, 2025, 45(5): 1658-1670.
[14]	Xueying LI, Kun YANG, Guoqing TU, Shubo LIU. Adversarial sample generation method for time-series data based on local augmentation [J]. Journal of Computer Applications, 2025, 45(5): 1573-1581.
[15]	Lihu PAN, Shouxin PENG, Rui ZHANG, Zhiyang XUE, Xuzhen MAO. Video anomaly detection for moving foreground regions [J]. Journal of Computer Applications, 2025, 45(4): 1300-1309.