Small target detection algorithm for train operating environment image based on improved YOLOv3


Journal of Computer Applications, 2023, Vol. 43, Issue (8): 2611-2618. DOI: 10.11772/j.issn.1001-9081.2022091343

Multimedia computing and computer simulation


Meijia LIANG¹, Xinwu LIU², Xiaopeng HU¹,³

1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China
2. Data and Intelligence Technology Center, Zhuzhou CRRC Times Electric Company Limited, Zhuzhou, Hunan 412001, China
3. Tangshan Institute, Southwest Jiaotong University, Tangshan, Hebei 063010, China

Received: 2022-09-15 Revised: 2022-12-20 Accepted: 2023-01-05 Online: 2023-03-02 Published: 2023-08-10
Contact: Meijia LIANG
About authors: LIU Xinwu, born in 1989, a native of Zhuzhou, Hunan, M. S., assistant engineer. His research interests include big data analysis.
HU Xiaopeng, born in 1972, a native of Hanzhong, Shaanxi, M. S., associate professor. His research interests include image processing and intelligent systems.
Supported by: Hebei Provincial Natural Science Foundation (F2022105033)


Abstract


Train assisted driving depends on real-time detection of the train operating environment, and images of that environment contain abundant small targets. Compared with large and medium targets, small targets that occupy less than 1% of the original image suffer from a high missed-detection rate and poor detection accuracy because of their low resolution. Therefore, a target detection algorithm for the train operating environment based on improved YOLOv3 was proposed, namely YOLOv3-TOEI (YOLOv3-Train Operating Environment Image). Firstly, the k-means clustering algorithm was used to optimize the anchors and thereby speed up network convergence. Then, dilated convolution was embedded in DarkNet-53 to expand the receptive field, and Dense convolutional Network (DenseNet) was introduced to obtain richer low-level image details. Finally, the unidirectional feature fusion structure of the original YOLOv3 was improved to a bidirectional and adaptive feature fusion structure, which combined deep and shallow features effectively and improved the detection of multi-scale targets (especially small targets). Experimental results show that YOLOv3-TOEI reaches a mean Average Precision (mAP)@0.5 of 84.5%, a relative improvement of 12.2% over the original YOLOv3 (75.3% in Tab. 3, i.e. 9.2 percentage points), at 83 Frames Per Second (FPS), verifying that the algorithm detects small targets in train operating environment images better.
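The anchor optimization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common YOLO convention of clustering ground-truth box sizes with 1 - IoU as the distance; the abstract does not state the distance metric or the number of clusters, so the function names, k=9, and the synthetic box data below are all hypothetical.

```python
import numpy as np

def iou_wh(boxes, centers):
    """IoU between (N, 2) box sizes and (K, 2) cluster centers, all anchored at the origin."""
    inter = (np.minimum(boxes[:, None, 0], centers[None, :, 0])
             * np.minimum(boxes[:, None, 1], centers[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_c = centers[:, 0] * centers[:, 1]
    return inter / (area_b[:, None] + area_c[None, :] - inter)

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs; each box joins the center with the highest IoU."""
    rng = np.random.default_rng(seed)
    centers = boxes[rng.choice(len(boxes), size=k, replace=False)].astype(float)
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, centers), axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):  # an empty cluster keeps its previous center
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Average best IoU per box: a usual quality score when comparing values of k.
    avg_iou = iou_wh(boxes, centers).max(axis=1).mean()
    order = np.argsort(centers[:, 0] * centers[:, 1])  # sort anchors by area
    return centers[order], avg_iou

# Synthetic (width, height) pairs in pixels stand in for real annotations.
rng = np.random.default_rng(1)
demo_boxes = rng.uniform(5.0, 200.0, size=(300, 2))
anchors, avg_iou = kmeans_anchors(demo_boxes, k=9)
print(anchors.round(1), round(float(avg_iou), 3))
```

Running the same function for several values of k and plotting the returned average IoU is one way to choose the number of anchors.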

Key words: train assisted driving, small target detection, dilated convolution, Dense convolutional Network (DenseNet), feature fusion, channel attention mechanism


CLC Number:

TP391.41

Meijia LIANG, Xinwu LIU, Xiaopeng HU. Small target detection algorithm for train operating environment image based on improved YOLOv3[J]. Journal of Computer Applications, 2023, 43(8): 2611-2618.


Figures/Tables 22

Fig. 1 Train operating environment image

Fig. 2 Structure of YOLOv3

Fig. 3 Sample of dataset

Fig. 4 Dilated convolutions with different dilation rates

Fig. 5 Structure of DenseNet

Fig. 6 Structure of YOLOv3-dia-dense backbone network

Fig. 7 Structure of FPN

Fig. 8 Structure of BiFPN

Fig. 9 Structure of ACFF

Fig. 10 Structure of YOLOv3-bi-acff network

Tab. 1 Specific up-sampling or down-sampling operations for different input feature layers

Input feature layer	Operation	Up-/down-sampling factor	Output feature map
256×64×64	Down-sampling	4	256×16×16
512×32×32	Down-sampling	2	512×16×16
1024×16×16	-	-	1024×16×16
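The shape bookkeeping in Tab. 1 can be checked with a short sketch. The table gives only the sampling factors, not the operator, so max pooling is used here purely as an illustrative assumption; `downsample` and the layer names are hypothetical.

```python
import numpy as np

def downsample(feature, factor):
    """Max-pool a (C, H, W) feature map by an integer factor (illustrative operator only)."""
    c, h, w = feature.shape
    assert h % factor == 0 and w % factor == 0, "factor must divide H and W"
    blocks = feature.reshape(c, h // factor, factor, w // factor, factor)
    return blocks.max(axis=(2, 4))

# (feature map, down-sampling factor) for the three rows of Tab. 1; factor 1 means unchanged.
inputs = {
    "shallow": (np.zeros((256, 64, 64), dtype=np.float32), 4),
    "middle": (np.zeros((512, 32, 32), dtype=np.float32), 2),
    "deep": (np.zeros((1024, 16, 16), dtype=np.float32), 1),
}
shapes = {name: downsample(f, s).shape for name, (f, s) in inputs.items()}
print(shapes)
```

After this step all three maps share the 16×16 spatial size, which is what allows them to be combined at that scale; only the channel counts still differ.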

Fig. 11 Proportion distribution of target to original image size in training set

Fig. 12 Proportion distribution of different targets to original image size in training set

Tab. 2 Information of training set and test set

Category	Training set samples	Test set samples
Total	26,164	6,491
person	5,861	1,413
locomotive	8,141	2,141
car	284	89
signal_light	5,758	1,481
truck	195	39
sign	1,138	275
obstacle	4,787	1,053

Fig. 13 Loss curve of YOLOv3-TOEI algorithm

Fig. 14 IAIoU curve with different k values

Fig. 15 Cluster center distribution

Fig. 16 Comparison of improved YOLOv3 backbone network algorithms under different numbers of dense block units

Tab. 3 Comparison of experimental results of module ablation

Dilated convolution	Dense module	Feature fusion	k-means	mAP@0.5	F1	FPS
-	-	-	-	0.753	0.777	145
-	-	-	√	0.802	0.823	145
√	-	-	-	0.763	0.786	136
√	√	-	-	0.780	0.796	131
-	-	√	-	0.779	0.798	85
√	√	√	-	0.792	0.808	83
√	√	√	√	0.845	0.861	83
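The first and last rows of Tab. 3 can be compared with a few lines of arithmetic. This is only a sketch over the values printed in the table; the helper `gains` is hypothetical. It shows that the 12.2% figure quoted in the abstract corresponds to a relative gain in mAP@0.5 rather than a difference in percentage points.

```python
# First row (original YOLOv3) and last row (all modules enabled) of Tab. 3.
baseline = {"mAP@0.5": 0.753, "F1": 0.777, "FPS": 145}
full = {"mAP@0.5": 0.845, "F1": 0.861, "FPS": 83}

def gains(base, new):
    """Return (absolute difference, relative change) between two metric values."""
    absolute = new - base
    return absolute, absolute / base

map_abs, map_rel = gains(baseline["mAP@0.5"], full["mAP@0.5"])
f1_abs, f1_rel = gains(baseline["F1"], full["F1"])
fps_abs, fps_rel = gains(baseline["FPS"], full["FPS"])

print(f"mAP@0.5: {map_abs * 100:+.1f} points, {map_rel * 100:+.1f}% relative")
print(f"F1:      {f1_abs * 100:+.1f} points, {f1_rel * 100:+.1f}% relative")
print(f"FPS:     {fps_abs:+d} frames/s, {fps_rel * 100:+.1f}% relative")
```

The same comparison makes the cost visible: the accuracy gain comes with a drop from 145 to 83 FPS, most of which appears in the row that adds feature fusion.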

Tab. 4 Comparison of each category AP between YOLOv3 and YOLOv3-TOEI

Category	YOLOv3	YOLOv3-TOEI
person	0.838	0.880
locomotive	0.957	0.963
car	0.880	0.913
signal_light	0.571	0.745
truck	0.928	0.951
sign	0.570	0.779
obstacle	0.528	0.680
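As a rough consistency check on Tab. 4, mAP@0.5 can be recomputed as the unweighted mean of the per-category AP values. The sketch below assumes that definition and uses only the numbers in the table; the variable names are hypothetical.

```python
# Per-category AP from Tab. 4: (YOLOv3, YOLOv3-TOEI).
ap = {
    "person": (0.838, 0.880),
    "locomotive": (0.957, 0.963),
    "car": (0.880, 0.913),
    "signal_light": (0.571, 0.745),
    "truck": (0.928, 0.951),
    "sign": (0.570, 0.779),
    "obstacle": (0.528, 0.680),
}

# mAP as the unweighted mean over the seven categories.
map_yolov3 = sum(old for old, _ in ap.values()) / len(ap)
map_toei = sum(new for _, new in ap.values()) / len(ap)

# Absolute AP gain per category, largest first.
ap_gain = {name: new - old for name, (old, new) in ap.items()}
ranked = sorted(ap_gain, key=ap_gain.get, reverse=True)

print(f"mAP YOLOv3: {map_yolov3:.4f}, mAP YOLOv3-TOEI: {map_toei:.4f}")
for name in ranked:
    print(f"{name:13s} {ap_gain[name]:+.3f}")
```

The recomputed means agree with the mAP@0.5 values in Tab. 3 to within the rounding of the per-category figures. The three largest gains fall on sign, signal_light, and obstacle, the categories where the original YOLOv3 scored lowest.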

Fig. 17 Comparison of target detection results between YOLOv3 and YOLOv3-TOEI

Tab. 5 Comprehensive comparison of YOLOv3-TOEI and other algorithms

Algorithm	mAP@0.5	F1	FPS
YOLOv3	0.753	0.777	145
YOLOv3-spp	0.748	0.772	145
YOLOv3-tiny	0.602	0.632	431
YOLOv4	0.775	0.710	64
Faster RCNN	0.856	0.876	21
SSD	0.719	0.752	109
YOLOv3-TOEI	0.845	0.861	83

