Multi-object tracking algorithm for construction machinery in transmission line scenarios

doi:10.11772/j.issn.1001-9081.2024070985

Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (7): 2351-2360.DOI: 10.11772/j.issn.1001-9081.2024070985

• Multimedia computing and computer simulation •

Pingping YU¹, Yuting YAN¹, Xinliang TANG¹, He SU², Jianchao WANG¹

¹ School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang Hebei 050018, China
² School of Electrical Engineering, Hebei University of Technology, Tianjin 300130, China

Received: 2024-07-15 Revised: 2024-10-11 Accepted: 2024-10-11 Online: 2025-07-10 Published: 2025-07-10
Contact: Xinliang TANG
About the authors: YU Pingping, born in 1984, Ph.D., associate professor. Her research interests include computer vision and artificial intelligence.
YAN Yuting, born in 1998, M.S. candidate. Her research interests include object detection and object tracking.
SU He, born in 1993, Ph.D. candidate. His research interests include analysis and control of power systems, and reliability theory and application of electrical equipment.
WANG Jianchao, born in 1990, Ph.D., lecturer. His research interests include deep learning, artificial intelligence, and intelligent information processing.
Supported by:
Youth Fund of Hebei Education Department (QN2023185)

Abstract

In transmission line inspection, tracking the movement of construction machinery effectively with deep learning is crucial for smart grid construction. To address the significant degradation in multi-object tracking performance caused by occlusion among targets and by false or missed detections, a multi-object tracking algorithm combining an improved YOLOv5s with an optimized ByteTrack was proposed. In the object detection part: firstly, lightweight Ghost convolution and SimAM were used to construct the SGC3 (SimAM and Ghost convolution with C3) module, improving feature utilization and reducing redundant computation. Secondly, in the deeper layers of the backbone network, a convolution-guided triplet attention module, R-Triplet (RFAConv with Triplet attention), was proposed; its multi-branch structure enhances cross-dimensional information interaction and suppresses irrelevant background information, improving object association capability. Finally, in the feature fusion stage, a Multi-branch Receptive Block (MRB) was added, using dilated convolution to expand the receptive field and to enhance the reuse of multi-scale global feature information. In the object tracking part: building on the ByteTrack algorithm and the motion characteristics of construction machinery, an NSA (Noise Scale Adaptively) Kalman filter algorithm with adaptive noise scale computation was proposed to reduce the influence of low-quality detection boxes on filtering performance. In addition, the Gaussian Smoothing Interpolation (GSI) algorithm was introduced into the data association process to further optimize multi-object tracking performance.
Experimental results show that the proposed CRM-YOLOv5s algorithm achieves a mean Average Precision (mAP) of 97.4%, 3.8 percentage points higher than the baseline YOLOv5s, with the number of parameters and the floating-point operations reduced by 0.28×10⁶ and 1.8 GFLOPs, respectively, demonstrating stronger generalization capability across application scenarios. Additionally, compared with the original YOLOv5s+ByteTrack tracking algorithm, CRM-YOLOv5s combined with the improved ByteTrack increases the Multiple Object Tracking Accuracy (MOTA) by 4.5 percentage points, reduces the number of identity switches (IDs) by 15, and achieves a higher inference speed, demonstrating that the algorithm is suitable for multi-object tracking of construction machinery in transmission line scenarios.
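
As a rough illustration of the NSA Kalman idea described in the abstract, the sketch below scales the measurement noise by detection confidence inside a scalar Kalman update. The rule R' = (1 - c) R is the form used in StrongSORT [20] and is an assumption here, since the abstract does not give the paper's exact formula; the function name, the scalar state, and the small `eps` guard are likewise hypothetical.

```python
def nsa_kalman_update(x, p, z, r, confidence, eps=1e-6):
    """Scalar Kalman measurement update with confidence-scaled noise.

    x, p: predicted state and its variance
    z, r: measurement and its nominal noise variance
    confidence: detector score in [0, 1]
    """
    # Assumed NSA rule: the lower the confidence, the larger the effective noise.
    r_adapted = (1.0 - confidence) * r + eps
    gain = p / (p + r_adapted)      # Kalman gain
    x_new = x + gain * (z - x)      # corrected state
    p_new = (1.0 - gain) * p        # corrected variance
    return x_new, p_new


# Same prediction and measurement, two different detection confidences.
high_conf_state, _ = nsa_kalman_update(x=0.0, p=1.0, z=10.0, r=1.0, confidence=0.9)
low_conf_state, _ = nsa_kalman_update(x=0.0, p=1.0, z=10.0, r=1.0, confidence=0.3)
```

With these inputs the high-confidence detection pulls the state closer to the measurement than the low-confidence one, which is the intended effect of down-weighting low-quality boxes.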

Key words: transmission line scenario, object detection, multi-object tracking, YOLOv5s, ByteTrack

CLC Number:

TP391.4

Pingping YU, Yuting YAN, Xinliang TANG, He SU, Jianchao WANG. Multi-object tracking algorithm for construction machinery in transmission line scenarios[J]. Journal of Computer Applications, 2025, 45(7): 2351-2360.

References 24

[1]	WONG S Y, CHOE C W C, GOH H H, et al. Power transmission line fault detection and diagnosis based on artificial intelligence approach and its development in UAV: a review [J]. Arabian Journal for Science and Engineering, 2021, 46(10): 9305-9331.
[2]	YAN J H, ZHANG K, SHI T J, et al. Multi-level feature fusion based dim small ground target detection in remote sensing images [J]. Journal of Scientific Instrument, 2022, 43(3): 221-229.
[3]	HAO S, ZHAO X S, MA X, et al. Multi-class defect target detection method for transmission lines based on TR-YOLOv5s [J]. Journal of Graphics, 2023, 44(4): 667-676.
[4]	GE W, JIANG T Y. Research on improved YOLO and Deepsort detection and tracking algorithm [J]. Computer Simulation, 2022, 39(5): 186-190.
[5]	HUANG Z H, CHEN Z L, ZHANG H X, et al. Object detection and tracking algorithm based on audio-visual information fusion [J]. Journal of Applied Optics, 2021, 42(5): 867-876.
[6]	XUE J T, MA R H, HU C F, et al. Deep learning algorithm based on MobileNet for multi-target tracking [J]. Control and Decision, 2021, 36(8): 1991-1996.
[7]	TU S Q, TANG Y J, LI C J, et al. Behavior recognition and tracking technology of group-raised pigs based on improved ByteTrack algorithm [J]. Transactions of the Chinese Society for Agricultural Machinery, 2022, 53(12): 264-272.
[8]	ZHANG C Y, LI M L, WEI D Z, et al. Object tracking in thermal infrared images based on improved FairMOT algorithm [J]. Computer Simulation, 2023, 40(12): 304-308.
[9]	HAN K, WANG Y, TIAN Q, et al. GhostNet: more features from cheap operations [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1577-1586.
[10]	YANG L, ZHANG R Y, LI L, et al. SimAM: a simple, parameter-free attention module for convolutional neural networks [C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 11863-11874.
[11]	SELVARAJU R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization [J]. International Journal of Computer Vision, 2020, 128(2): 336-359.
[12]	LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 936-944.
[13]	LIU S, QI L, QIN H, et al. Path aggregation network for instance segmentation [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8759-8768.
[14]	SHEN H Y, HUANG Z Y, WANG H C, et al. Improved Tracktor-based pedestrian multi-objective tracking algorithm [J]. Computer Engineering and Applications, 2024, 60(8): 242-249.
[15]	WANG Q, WU B, ZHU P, et al. ECA-Net: efficient channel attention for deep convolutional neural networks [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11531-11539.
[16]	ZHU L, WANG X, KE Z, et al. BiFormer: Vision Transformer with bi-level routing attention [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 10323-10333.
[17]	CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848.
[18]	YU Z, HUANG H, CHEN W, et al. YOLO-FaceV2: a scale and occlusion aware face detector [J]. Pattern Recognition, 2024, 155: No. 110714.
[19]	WOJKE N, BEWLEY A, PAULUS D. Simple online and realtime tracking with a deep association metric [C]// Proceedings of the 2017 IEEE International Conference on Image Processing. Piscataway: IEEE, 2017: 3645-3649.
[20]	DU Y, ZHAO Z, SONG Y, et al. StrongSORT: make DeepSORT great again [J]. IEEE Transactions on Multimedia, 2023, 25: 8725-8737.
[21]	AHARON N, ORFAIG R, BOBROVSKY B Z. BoT-SORT: robust associations multi-pedestrian tracking [EB/OL]. [2023-09-01].
[22]	ZHANG Y, WANG C, WANG X, et al. FairMOT: on the fairness of detection and re-identification in multiple object tracking [J]. International Journal of Computer Vision, 2021, 129(11): 3069-3087.
[23]	CHEN L, AI H, ZHUANG Z, et al. Real-time multiple people tracking with deeply learned candidate selection and person re-identification [C]// Proceedings of the 2018 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2018: 1-6.
[24]	ZHANG Y, SUN P, JIANG Y, et al. ByteTrack: multi-object tracking by associating every detection box [C]// Proceedings of the 2022 European Conference on Computer Vision, LNCS 13682. Cham: Springer, 2022: 1-21.

Parameter	Value
epochs	300
batch size	16
image size	640
optimizer	SGD
momentum	0.937
weight-decay	0.0005

Dilation rate combination	mAP/%↑	Floating-point operations/GFLOPs↓
r1=1, r2=3, r3=3, r4=5	95.7	16.4
r1=3, r2=3, r3=3, r4=5	95.5	16.5
r1=3, r2=5, r3=5, r4=7	95.3	16.5
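
To make the dilation rates in this table more concrete, the following sketch computes the span covered by a single dilated kernel, using the standard relation k_eff = k + (k - 1)(r - 1). The 3×3 kernel size is an assumption, since the table lists only the rates, and the function name is hypothetical.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span (in pixels) covered by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Dilation-rate combinations (r1, r2, r3, r4) taken from the table above.
combinations = [(1, 3, 3, 5), (3, 3, 3, 5), (3, 5, 5, 7)]

# Assumed 3x3 kernels: each rate maps to the span of the dilated kernel.
spans = [[effective_kernel_size(3, r) for r in combo] for combo in combinations]
```

Under the 3×3 assumption, rates 1, 3, 5 and 7 correspond to spans of 3, 7, 11 and 15 pixels, so the first combination keeps one branch at the undilated scale while the other combinations drop it.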

Experiment	SGC3	R-Triplet	MRB	mAP/%↑	Model size/MB↓	Floating-point operations/GFLOPs↓	Parameters/10⁶↓	Frame rate/(frame·s⁻¹)↑
Experiment 1	-	-	-	93.6	13.7	16.0	7.03	51.2
Experiment 2	√	-	-	94.8	11.6	12.6	5.89	65.3
Experiment 3	-	√	-	94.2	13.8	16.0	7.10	50.8
Experiment 4	-	-	√	94.4	13.8	16.2	7.11	49.1
Experiment 5	√	√	-	96.3	12.0	12.6	5.89	63.0
Experiment 6	√	√	√	97.4	10.1	14.2	6.75	60.5
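
The parameter savings attributed to the Ghost-convolution-based SGC3 module can be motivated with a small count, following the Ghost module formulation in GhostNet [9]: a primary convolution produces a fraction of the output channels, and cheap depthwise operations generate the rest. The layer sizes, the ratio of 2, and the 3×3 cheap kernel below are illustrative assumptions, not values taken from the paper; biases are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weights of a Ghost module: primary conv plus cheap depthwise ops."""
    intrinsic = c_out // ratio                              # channels from the primary conv
    primary = conv_params(c_in, intrinsic, k)
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k     # one depthwise filter per ghost map
    return primary + cheap


# Hypothetical layer: 128 input channels, 128 output channels, 3x3 kernel.
standard = conv_params(128, 128, 3)
ghost = ghost_conv_params(128, 128, 3)
saving = standard / ghost
```

For this hypothetical layer the Ghost variant needs roughly half the weights of the standard convolution, which is in line with the direction of the parameter reduction reported for Experiment 2.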

Algorithm	mAP/%↑	Change vs. YOLOv5s/percentage points
YOLOv5s	93.6	-
YOLOv5s+ECA	94.0	0.4
YOLOv5s+BiFormer	93.7	0.1
YOLOv5s+R-Triplet	94.2	0.6

Algorithm	mAP/%↑	Change vs. YOLOv5s/percentage points
YOLOv5s	93.6	-
YOLOv5s+ASPP [17]	94.9	1.3
YOLOv5s+Scale-Aware RFE Model [18]	95.5	1.9
YOLOv5s+MRB	95.7	2.1

Algorithm	mAP/%↑	Parameters/10⁶↓	Floating-point operations/GFLOPs↓	Model size/MB↓
YOLOv3	92.8	61.54	155.3	117.8
YOLOv3-tiny	79.1	8.68	13.0	16.6
YOLOv5s	93.6	7.03	16.0	13.7
YOLOv7	94.4	37.21	105.2	71.3
YOLOX-s	96.5	8.92	26.5	18.1
YOLOv8s	96.2	11.13	28.4	22.5
CRM-YOLOv5s	97.4	6.75	14.2	10.1

Algorithm	mAP/%↑	AP/%↑ (truck)	AP/%↑ (excavator)	AP/%↑ (crane)	AP/%↑ (loader)
YOLOv5s	93.6	95.7	97.9	98.6	82.2
CRM-YOLOv5s	97.4	97.2	95.4	97.5	95.5

Algorithm	NSA	GSI	MOTA/%	IDF1/%	IDs
Algorithm 1	-	-	84.1	86.4	32
Algorithm 2	√	-	85.7	86.6	30
Algorithm 3	-	√	85.9	87.1	27
Algorithm 4	√	√	86.9	87.6	22
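
The GSI column above refers to filling gaps in a trajectory and then smoothing it. The sketch below is a deliberately simplified stand-in: it fills missing frames by linear interpolation and then applies a fixed-width Gaussian kernel, whereas StrongSORT [20] formulates GSI with Gaussian process regression. All names, the sample track, and `sigma` are hypothetical.

```python
import math


def interpolate_gaps(frames, values):
    """Linearly fill missing frames between consecutive observations."""
    filled = {}
    observed = list(zip(frames, values))
    for (f0, v0), (f1, v1) in zip(observed, observed[1:]):
        for f in range(f0, f1):
            t = (f - f0) / (f1 - f0)
            filled[f] = v0 + t * (v1 - v0)
    filled[frames[-1]] = values[-1]
    return filled


def gaussian_smooth(track, sigma=1.0):
    """Replace each value by a Gaussian-weighted average over all frames."""
    frames = sorted(track)
    smoothed = {}
    for f in frames:
        weights = [math.exp(-((f - g) ** 2) / (2 * sigma ** 2)) for g in frames]
        total = sum(weights)
        smoothed[f] = sum(w * track[g] for w, g in zip(weights, frames)) / total
    return smoothed


# Hypothetical box-center x coordinates; frames 3 and 4 were missed by the detector.
frames = [1, 2, 5, 6]
x_centers = [10.0, 12.0, 21.0, 22.0]
filled = interpolate_gaps(frames, x_centers)
smoothed = gaussian_smooth(filled, sigma=1.0)
```

In a tracker this would be applied per coordinate of each track after association, so that short detection gaps no longer split or distort the trajectory.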

Algorithm	MOTA/%	IDF1/%	IDs
YOLOv5s+ByteTrack	84.1	86.4	32
CRM-YOLOv5s+ByteTrack	86.3	88.2	27
YOLOv5s+NG-ByteTrack	86.9	87.6	22
CRM-YOLOv5s+NG-ByteTrack	88.6	90.1	17

Algorithm	MOTA/%	IDF1/%	IDs
DeepSORT [19]	72.4	84.7	42
StrongSORT [20]	73.1	82.9	40
BoT-SORT [21]	75.5	79.8	52
FairMOT [22]	81.3	85.2	76
MOTDT [23]	70.8	81.5	38
ByteTrack [24]	83.2	85.6	34
Proposed algorithm	88.6	90.1	17
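
For readers unfamiliar with the MOTA column, the sketch below evaluates the standard CLEAR MOT definition, MOTA = 1 - (FN + FP + IDSW) / GT, where GT is the total number of ground-truth boxes over all frames. The counts are made up for illustration and are not taken from the paper's experiments.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy from accumulated error counts."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical counts over a whole sequence.
example_mota = mota(false_negatives=80, false_positives=50, id_switches=20, num_gt=1000)
```

Because identity switches enter the numerator alongside missed and false detections, a tracker can raise MOTA either by detecting better or by keeping identities more stable, which is why the tables report IDs and IDF1 next to it.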

Algorithm	mAP@0.5/%	mAP@0.5:0.95/%	Frame rate/(frame·s⁻¹)
YOLOv5s	92.0	60.2	28.4
CRM-YOLOv5s	96.9	63.4	35.9
