Small target detection algorithm for train operating environment image based on improved YOLOv3

Journal of Computer Applications, 2023, Vol. 43, Issue (8): 2611-2618. DOI: 10.11772/j.issn.1001-9081.2022091343

• Multimedia computing and computer simulation •

Meijia LIANG¹, Xinwu LIU², Xiaopeng HU¹,³

1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu Sichuan 611756, China
2. Data and Intelligence Technology Center, Zhuzhou CRRC Times Electric Company Limited, Zhuzhou Hunan 412001, China
3. Tangshan Institute, Southwest Jiaotong University, Tangshan Hebei 063010, China

Received: 2022-09-15 Revised: 2022-12-20 Accepted: 2023-01-05 Online: 2023-03-02 Published: 2023-08-10
Corresponding author: Meijia LIANG
About the authors: LIU Xinwu, born in 1989, from Zhuzhou, Hunan, M. S., assistant engineer. His research interests include big data analysis.
HU Xiaopeng, born in 1972, from Hanzhong, Shaanxi, M. S., associate professor. His research interests include image processing and intelligent systems.
Supported by:
Hebei Provincial Natural Science Foundation (F2022105033)

Abstract:

Train assisted driving depends on real-time detection of the train operating environment, and images of that environment contain abundant small targets. Compared with large and medium targets, small targets, which occupy less than 1% of the original image, suffer from high missed-detection rates and poor detection accuracy because of their low resolution. To address this, a target detection algorithm for the train operating environment based on an improved YOLOv3 was proposed, namely YOLOv3-TOEI (YOLOv3-Train Operating Environment Image). Firstly, the k-means clustering algorithm was used to optimize the anchors and thereby speed up the convergence of the network. Then, dilated convolution was embedded in DarkNet-53 to enlarge the receptive field, and a Dense convolutional Network (DenseNet) was introduced to capture richer low-level image details. Finally, the unidirectional feature fusion structure of the original YOLOv3 was replaced with a bidirectional adaptive feature fusion structure, which combines deep and shallow features effectively and improves the detection of multi-scale targets, especially small targets. Experimental results show that YOLOv3-TOEI reaches a mean Average Precision (mAP)@0.5 of 84.5%, a relative improvement of 12.2% over the original YOLOv3, at 83 Frames Per Second (FPS), indicating a better ability to detect small targets in train operating environment images.

Key words: train assisted driving, small target detection, dilated convolution, Dense convolutional Network (DenseNet), feature fusion, channel attention mechanism
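
The anchor optimization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the distance commonly used for YOLO anchor clustering, 1 - IoU between (width, height) pairs, which this page does not confirm as the authors' exact choice. The function names `iou_wh`, `kmeans_anchors` and `mean_best_iou` are hypothetical, and any boxes passed in are illustrative rather than taken from the paper's dataset.

```python
import random


def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) tuples into k anchors using 1 - IoU as distance."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest (1 - IoU) distance.
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Sort by area so anchors run from small to large.
    return sorted(centroids, key=lambda c: c[0] * c[1])


def mean_best_iou(boxes, centroids):
    """Average, over all boxes, of the IoU with the best-matching anchor."""
    return sum(max(iou_wh(b, c) for c in centroids) for b in boxes) / len(boxes)
```

Evaluating `mean_best_iou` for several values of k is one common way to choose the number of anchors: the score usually rises with k, and the point where the curve flattens is a reasonable trade-off.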

CLC number: TP391.41

Meijia LIANG, Xinwu LIU, Xiaopeng HU. Small target detection algorithm for train operating environment image based on improved YOLOv3[J]. Journal of Computer Applications, 2023, 43(8): 2611-2618.

Figures and tables (22)

Fig. 1 Train operating environment image

Fig. 2 Structure of YOLOv3

Fig. 3 Sample of dataset

Fig. 4 Dilated convolutions with different dilation rates
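
As a rough illustration of why dilation enlarges the receptive field (the subject of Fig. 4), the standard relation for a kernel of size k with dilation rate d gives an effective kernel size of k + (k - 1)(d - 1), with no additional weights. The helper names below are hypothetical, and the specific dilation rates used in YOLOv3-TOEI are not listed on this page.

```python
def effective_kernel_size(kernel, dilation):
    """Spatial extent covered by a dilated kernel along one axis."""
    return kernel + (kernel - 1) * (dilation - 1)


def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one axis for a (possibly dilated) convolution."""
    span = effective_kernel_size(kernel, dilation)
    return (size + 2 * padding - span) // stride + 1
```

With a 3×3 kernel, setting the padding equal to the dilation rate keeps the feature map size unchanged at stride 1, which is why dilated layers can be embedded in a backbone without altering the shapes seen by later layers.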

Fig. 5 Structure of DenseNet

Fig. 6 Structure of YOLOv3-dia-dense backbone network

Fig. 7 Structure of FPN

Fig. 8 Structure of BIFPN

Fig. 9 Structure of ACFF

Fig. 10 Structure of YOLOv3-bi-acff network

Tab. 1 Specific up-sampling or down-sampling operations for different input feature layers

Input feature layer	Operation	Sampling factor	Output feature map
256×64×64	Down-sampling	4	256×16×16
512×32×32	Down-sampling	2	512×16×16
1024×16×16	-	-	1024×16×16
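
The shape bookkeeping in Tab. 1 can be checked with a short sketch. This page does not state which down-sampling operator the authors use (strided convolution, pooling, or something else), so max pooling is an assumption made purely for illustration, and `output_shape` and `max_pool2d` are hypothetical helper names.

```python
def output_shape(shape, factor):
    """Shape (channels, height, width) after down-sampling by an integer factor."""
    channels, height, width = shape
    if height % factor or width % factor:
        raise ValueError("height and width must be divisible by the factor")
    return (channels, height // factor, width // factor)


def max_pool2d(channel, factor):
    """Down-sample one 2-D channel (a list of rows) with factor x factor max pooling."""
    height, width = len(channel), len(channel[0])
    if height % factor or width % factor:
        raise ValueError("height and width must be divisible by the factor")
    return [
        [
            max(
                channel[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)
            )
            for c in range(0, width, factor)
        ]
        for r in range(0, height, factor)
    ]
```

Bringing all three layers to a common 16×16 resolution is what allows them to be combined element by element in a later fusion step.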

Fig. 11 Proportion distribution of target to original image size in training set

Fig. 12 Proportion distribution of different targets to original image size in training set
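
The abstract treats a target as small when it occupies less than 1% of the original image. A minimal sketch of that criterion follows, assuming the proportion is measured as the ratio of bounding-box area to image area, which is an interpretation this page does not spell out. The function names and any image sizes used with them are hypothetical.

```python
def area_ratio(box_w, box_h, img_w, img_h):
    """Fraction of the image area covered by a bounding box."""
    return (box_w * box_h) / (img_w * img_h)


def is_small_target(box_w, box_h, img_w, img_h, threshold=0.01):
    """True when the box covers less than `threshold` of the image (1% by default)."""
    return area_ratio(box_w, box_h, img_w, img_h) < threshold
```

Applying `area_ratio` to every annotated box in a training set and plotting a histogram of the results would produce a distribution of the kind shown in Fig. 11 and Fig. 12.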

Tab. 2 Information of training set and test set

Category	Training samples	Test samples
Total	26,164	6,491
person	5,861	1,413
locomotive	8,141	2,141
car	284	89
signal_light	5,758	1,481
truck	195	39
sign	1,138	275
obstacle	4,787	1,053

Fig. 13 Loss curve of YOLOv3-TOEI algorithm

Fig. 14 IAIoU curve for different values of k

Fig. 15 Cluster center distribution

Fig. 16 Comparison of improved YOLOv3 backbone network algorithms under different numbers of dense block units

Tab. 3 Comparison of experimental results of module ablation

Dilated convolution	Dense block	Feature fusion	k-means	mAP@0.5	F1	FPS
-	-	-	-	0.753	0.777	145
-	-	-	√	0.802	0.823	145
√	-	-	-	0.763	0.786	136
√	√	-	-	0.780	0.796	131
-	-	√	-	0.779	0.798	85
√	√	√	-	0.792	0.808	83
√	√	√	√	0.845	0.861	83
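
The first and last rows of Tab. 3 can be compared with simple arithmetic. The sketch below uses only the mAP@0.5 and FPS values from the table; the helper name `gains` is hypothetical. The result suggests that the 12.2% improvement quoted in the abstract is a relative gain, while the absolute gain is about 9.2 percentage points. This reading is an inference from the table, not a statement made on this page.

```python
def gains(baseline, improved):
    """Return (absolute gain, relative gain) of `improved` over `baseline`."""
    absolute = improved - baseline
    relative = absolute / baseline
    return absolute, relative


# mAP@0.5 from Tab. 3: baseline YOLOv3 (first row) and full YOLOv3-TOEI (last row).
map_absolute, map_relative = gains(0.753, 0.845)

# FPS from the same two rows; a negative gain means the full model is slower.
fps_absolute, fps_relative = gains(145, 83)
```

The same comparison shows the cost of the added modules: throughput drops from 145 to 83 FPS, with most of the drop appearing in the rows where feature fusion is enabled.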

Tab. 4 Comparison of each category AP between YOLOv3 and YOLOv3-TOEI

Category	YOLOv3	YOLOv3-TOEI
person	0.838	0.880
locomotive	0.957	0.963
car	0.880	0.913
signal_light	0.571	0.745
truck	0.928	0.951
sign	0.570	0.779
obstacle	0.528	0.680
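
Since mAP is the mean of the per-category AP values, the figures in Tab. 4 can be cross-checked against the overall mAP@0.5 reported elsewhere on this page. The sketch below copies the values from the table; the names `yolov3_ap`, `toei_ap` and `mean_ap` are hypothetical. The YOLOv3 mean comes to about 0.753 and the YOLOv3-TOEI mean to about 0.844, which agrees with the reported 0.845 up to rounding of the per-category values.

```python
# Per-category AP values copied from Tab. 4.
yolov3_ap = {
    "person": 0.838, "locomotive": 0.957, "car": 0.880, "signal_light": 0.571,
    "truck": 0.928, "sign": 0.570, "obstacle": 0.528,
}
toei_ap = {
    "person": 0.880, "locomotive": 0.963, "car": 0.913, "signal_light": 0.745,
    "truck": 0.951, "sign": 0.779, "obstacle": 0.680,
}


def mean_ap(per_class_ap):
    """Mean Average Precision: the unweighted mean of per-category AP."""
    return sum(per_class_ap.values()) / len(per_class_ap)


def largest_gain(before, after):
    """Category with the largest AP increase, and the size of that increase."""
    name = max(before, key=lambda c: after[c] - before[c])
    return name, after[name] - before[name]
```

The per-category differences show the largest gains on sign, signal_light and obstacle, the three categories with the lowest baseline AP.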

Fig. 17 Comparison of target detection results between YOLOv3 and YOLOv3-TOEI

Tab. 5 Comprehensive comparison of YOLOv3-TOEI and other algorithms

Algorithm	mAP@0.5	F1	FPS
YOLOv3	0.753	0.777	145
YOLOv3-spp	0.748	0.772	145
YOLOv3-tiny	0.602	0.632	431
YOLOv4	0.775	0.710	64
Faster RCNN	0.856	0.876	21
SSD	0.719	0.752	109
YOLOv3-TOEI	0.845	0.861	83

