Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (8): 2619-2629. DOI: 10.11772/j.issn.1001-9081.2022081207

• Multimedia computing and computer simulation •

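Dam surface disease detection algorithm based on improved YOLOv5

DUAN Shengwei1, CHENG Xinyu1, WANG Haozhou1, WANG Fei2

  1. State Key Laboratory of Public Big Data (Guizhou University), Guiyang 550025
  2. Power China Guiyang Engineering Corporation Limited, Guiyang 550081
  • Received: 2022-09-01 Revised: 2022-11-07 Accepted: 2022-11-14 Online: 2023-01-11 Published: 2023-08-10
  • Corresponding author: CHENG Xinyu
  • About the authors: DUAN Shengwei, born in 1996 in Panzhihua, Sichuan, male, M.S. candidate. His main research interests include computer vision and object detection.
    WANG Haozhou, born in 1994 in Guiyang, Guizhou, male, M.S. candidate. His main research interest is machine vision.
    WANG Fei, born in 1982 in Guiyang, Guizhou, male, engineer. His main research interests include engineering safety monitoring software and automatic control systems.
  • Funding:
    Research Project of the Water Resources Department of Guizhou Province (KT202010)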

Dam surface disease detection algorithm based on improved YOLOv5

Shengwei DUAN1, Xinyu CHENG1, Haozhou WANG1, Fei WANG2

  1. State Key Laboratory of Public Big Data (Guizhou University), Guiyang, Guizhou 550025, China
  2. Power China Guiyang Engineering Corporation Limited, Guiyang, Guizhou 550081, China
  • Received: 2022-09-01 Revised: 2022-11-07 Accepted: 2022-11-14 Online: 2023-01-11 Published: 2023-08-10
  • Contact: Xinyu CHENG
  • About author: DUAN Shengwei, born in 1996, M.S. candidate. His research interests include computer vision and object detection.
    WANG Haozhou, born in 1994, M.S. candidate. His research interests include machine vision.
    WANG Fei, born in 1982, engineer. His research interests include engineering safety monitoring software and automatic control systems.
  • Supported by:
    Research Project of Water Resources Department of Guizhou Province (KT202010)

Abstract:

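To address the high operating cost and low efficiency of water conservancy dams that currently depend mainly on manual on-site inspection, an improved detection algorithm based on YOLOv5 is proposed. First, the backbone network is improved with a modified multi-scale visual Transformer structure, and aggregated features are built from the multi-scale global information associated by the multi-scale Transformer structure and the local information extracted by a Convolutional Neural Network (CNN), so that multi-scale semantic and positional information is fully used to improve the feature extraction capability of the network. Then, a coordinate attention mechanism is added before each feature detection layer to encode features along the height and width directions of the image separately; the encoded features are then used to build long-range associations between pixels on the feature map, which strengthens the target localization ability of the network in complex environments. Next, the sampling algorithm for positive and negative training samples is improved: samples are screened by the average fit and the difference constructed between prior boxes and ground-truth boxes, which helps candidate positive samples respond to prior boxes whose shapes are close to their own, so that the network converges faster and better, improving its overall performance and generalization. Finally, the network is made lightweight to meet application requirements, and its structure is optimized through structural pruning and structural re-parameterization. Experimental results show that, on the dam disease data currently used, compared with the original YOLOv5s algorithm, the improved network raises mAP@0.5 by 10.5 percentage points and mAP@0.5:0.95 by 17.3 percentage points; compared with the network before lightweighting, the lightweight network reduces the number of parameters and the computational cost by 24% and 13% respectively and increases detection speed by 42%, meeting the precision and speed requirements of disease detection in the current application scenario.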
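The abstract names structural re-parameterization as one of the lightweighting steps but does not spell out the exact scheme. As a rough illustration only, the sketch below shows one widely used building block of re-parameterization: folding an inference-mode batch normalization into the preceding linear (or 1x1 convolution) layer. The function names `fold_bn`, `linear`, and `batch_norm` are hypothetical and are not taken from the paper.

```python
import math

def linear(x, weight, bias):
    """Apply y = W x + b; weight is a list of rows, one per output channel."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch norm with fixed per-channel statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)]

def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold batch norm into the preceding layer (assumed scheme, not the paper's).

    Returns a fused weight and bias such that
    linear(x, fused_w, fused_b) == batch_norm(linear(x, weight, bias), ...).
    """
    fused_w, fused_b = [], []
    for row, b, g, bt, m, v in zip(weight, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)          # per-channel rescaling factor
        fused_w.append([w * scale for w in row])
        fused_b.append((b - m) * scale + bt)
    return fused_w, fused_b
```

After folding, the two-layer pair can be replaced by a single layer at inference time, which removes the normalization cost without changing the output (up to floating-point rounding).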

Key words: object detection, engineering defect, YOLOv5, multi-scale visual Transformer, coordinate attention mechanism, dam disease

Abstract:

To address the problem that water conservancy dams currently rely mainly on manual on-site inspection, which has high operating costs and low efficiency, an improved detection algorithm based on YOLOv5 was proposed. Firstly, a modified multi-scale visual Transformer structure was used to improve the backbone, and the multi-scale global information associated by the multi-scale Transformer structure and the local information extracted by a Convolutional Neural Network (CNN) were used to construct aggregated features, thereby making full use of multi-scale semantic and location information to improve the feature extraction capability of the network. Then, a coordinate attention mechanism was added in front of each feature detection layer of the network to encode features along the height and width directions of the image separately, and long-distance associations between pixels on the feature map were constructed from the encoded features to enhance the target localization ability of the network in complex environments. Next, the sampling algorithm for the positive and negative training samples of the network was improved: samples were screened by the average fit and the difference constructed between the prior boxes and the ground-truth boxes, which helped candidate positive samples respond to prior boxes of similar shape to themselves, so that the network converged faster and better, improving its overall performance and generalization. Finally, the network was made lightweight to meet application requirements, and its structure was optimized by network pruning and structural re-parameterization. Experimental results show that, on the dam disease data currently used, compared with the original YOLOv5s algorithm, the improved network improves mAP (mean Average Precision)@0.5 by 10.5 percentage points and mAP@0.5:0.95 by 17.3 percentage points; compared with the network before lightweighting, the lightweight network reduces the number of parameters and the FLOPs (FLoating point OPerations) by 24% and 13% respectively and improves the detection speed by 42%, verifying that the network meets the requirements for precision and speed of disease detection in the current application scenario.
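For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of the Intersection over Union (IoU) computation that underlies mAP@0.5 and mAP@0.5:0.95. It assumes axis-aligned boxes given as `(x1, y1, x2, y2)`; the helper names `iou` and `map_iou_thresholds` are hypothetical, and the full average-precision computation is omitted. mAP@0.5 counts a detection as correct when its IoU with a ground-truth box is at least 0.5, while mAP@0.5:0.95 averages the result over the ten thresholds from 0.5 to 0.95 in steps of 0.05.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def map_iou_thresholds():
    """IoU thresholds averaged over in mAP@0.5:0.95 (0.50, 0.55, ..., 0.95)."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]
```

Because the stricter thresholds demand tighter localization, an improvement in mAP@0.5:0.95 generally reflects better box placement and not only more detections.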

Key words: object detection, engineering defect, YOLOv5, multi-scale visual Transformer, coordinate attention mechanism, dam disease

CLC Number: