Journal of Computer Applications


Small target detection model in overlooking scenes on tower cranes based on improved real-time detection Transformer

  

  • Received: 2023-12-26 Revised: 2024-03-15 Online: 2024-04-10 Published: 2024-04-10


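PANG Yudong (庞玉东), LI Zhixing (李志星), LIU Weijie (刘伟杰), LI Tianhao (李天昊)

  1. School of Mechanical-Electronic and Vehicle Engineering, Beijing University of Civil Engineering and Architecture
  • Corresponding author: PANG Yudong
  • Supported by:
    Research on graph-domain mechanical fault diagnosis methods for complex electromechanical equipment; Research on key techniques and methods for early fault diagnosis of planetary gearboxes based on constrained-potential stochastic resonance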

Abstract: To address safety problems for construction site personnel, such as falling objects caused by collisions between tower crane hooks and casualties caused by tower crane collapse, a small target detection model for overlooking scenes on tower cranes based on an improved Real-Time DEtection Transformer (RT-DETR) was proposed. First, a multi-branch training and single-branch inference structure, designed using the idea of model reparameterization, was added to the original model to improve detection speed. Second, the convolution module in FastNet-Block was redesigned and used to replace the BasicBlock in the original backbone to improve the performance of the detection model. Third, the new loss function Inner-SIoU was adopted to further improve the performance and convergence speed of the model. Finally, ablation and comparison experiments were conducted to verify the model performance. The results show that the proposed model achieves a precision of 94.7% on top-view small target images from tower cranes, 6.1 percentage points higher than the original RT-DETR model. The detection rate reaches 59.7 frames per second (FPS), an improvement of 21 percentage points in detection speed over the original model. On the public dataset COCO 2017, the AP is 2.4, 1.5, and 1.3 percentage points higher than that of YOLOv5, YOLOv7, and YOLOv8, respectively. The model therefore meets the requirements for small target detection accuracy and speed in overlooking scenes on tower cranes.

Key words: object detection, RT-DETR (Real-Time DEtection TRansformer), small target, Transformer, computer vision, attention mechanism
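
The abstract does not describe the reparameterized structure in detail, so the following is only a minimal sketch of the general idea of multi-branch training and single-branch inference, assuming a RepVGG-style block with a 3x3 branch, a 1x1 branch, and an identity shortcut. The single-channel setting, the absence of batch normalization, and all names (`conv2d_same`, `multi_branch`, `fuse`) are illustrative assumptions, not the authors' implementation.

```python
def conv2d_same(image, kernel):
    """Single-channel 3x3 cross-correlation with zero padding (same output size)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di + 1][dj + 1] * image[y][x]
            out[i][j] = acc
    return out


def multi_branch(image, k3, k1):
    """Training-time form (assumed): 3x3 branch + 1x1 branch + identity shortcut."""
    y3 = conv2d_same(image, k3)
    return [[y3[i][j] + k1 * image[i][j] + image[i][j]
             for j in range(len(image[0]))]
            for i in range(len(image))]


def fuse(k3, k1):
    """Inference-time form: fold the 1x1 weight and the identity into the 3x3 centre."""
    fused = [row[:] for row in k3]
    fused[1][1] += k1 + 1.0
    return fused


# Hypothetical input and weights, chosen only to exercise the code.
image = [[1.0, 2.0, 0.0, -1.0],
         [0.5, 3.0, 1.0, 2.0],
         [-2.0, 0.0, 4.0, 1.0]]
k3 = [[0.1, -0.2, 0.0],
      [0.3, 0.5, -0.1],
      [0.0, 0.2, 0.4]]
k1 = 0.7

trained = multi_branch(image, k3, k1)          # three branches
deployed = conv2d_same(image, fuse(k3, k1))    # one fused convolution
max_diff = max(abs(a - b)
               for row_a, row_b in zip(trained, deployed)
               for a, b in zip(row_a, row_b))
```

Because convolution is linear, the fused kernel reproduces the three-branch output up to floating-point rounding, which is why such a structure can reduce inference cost without changing what the trained block computes.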

