Journal of Computer Applications, 2022, Vol. 42, Issue (8): 2423-2431. DOI: 10.11772/j.issn.1001-9081.2021060984

• Artificial Intelligence •

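Multi-scale object detection algorithm based on improved YOLOv3

Liying ZHANG1, Chunjiang PANG1, Xinying WANG1, Guoliang LI2

  1. Department of Computer, North China Electric Power University (Baoding), Baoding, Hebei 071003
  2. Zaozhuang Power Supply Company, State Grid Shandong Electric Power Company, Zaozhuang, Shandong 277800
  • Received: 2021-06-10 Revised: 2021-09-28 Accepted: 2021-10-12 Online: 2021-12-27 Published: 2022-08-10
  • Corresponding author: Xinying WANG
  • About the authors: ZHANG Liying (born 1996), female, from Baoding, Hebei, M.S. candidate. Main research interests: image processing, deep learning;
    PANG Chunjiang (born 1965), male, from Baoding, Hebei, associate professor, M.S. Main research interests: graphics and image processing, deep learning;
    WANG Xinying (born 1984), male, from Baoding, Hebei, lecturer, M.S. Main research interests: artificial intelligence;
    LI Guoliang (born 1981), male, from Anxin, Hebei, lecturer, M.S. Main research interests: computer vision, knowledge graph.
  • Funding:
    Fundamental Research Funds for the Central Universities (2021MS090); Science and Technology Project of Zaozhuang Power Supply Company, State Grid Shandong Electric Power Company (SD20-GC-ZB003-SGZH-KJ)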

Multi-scale object detection algorithm based on improved YOLOv3

Liying ZHANG1, Chunjiang PANG1, Xinying WANG1, Guoliang LI2

  1. Department of Computer, North China Electric Power University (Baoding), Baoding Hebei 071003, China
  2. Zaozhuang Power Supply Company, State Grid Shandong Electric Power Company, Zaozhuang Shandong 277800, China
  • Received: 2021-06-10 Revised: 2021-09-28 Accepted: 2021-10-12 Online: 2021-12-27 Published: 2022-08-10
  • Contact: Xinying WANG
  • About the authors: ZHANG Liying, born in 1996, M. S. candidate. Her research interests include image processing and deep learning.
    PANG Chunjiang, born in 1965, M. S., associate professor. His research interests include graphics and image processing, and deep learning.
    WANG Xinying, born in 1984, M. S., lecturer. His research interests include artificial intelligence.
    LI Guoliang, born in 1981, M. S., lecturer. His research interests include computer vision and knowledge graphs.
  • Supported by:
    Fundamental Research Funds for the Central Universities (2021MS090); Science and Technology Project of Zaozhuang Power Supply Company, State Grid Shandong Electric Power Company (SD20-GC-ZB003-SGZH-KJ)

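Abstract:

To further improve the speed and precision of multi-scale object detection, and to address the missed, false, and repeated detections that small objects tend to cause, an object detection algorithm based on improved YOLOv3 is proposed for the automatic detection of multi-scale objects. First, the structure of the feature extraction network is improved: an attention mechanism is introduced into the spatial dimension of the residual module so that the network attends to small objects. Then, a densely connected network (DenseNet) is used to fully fuse the shallow-layer information of the network, and the ordinary convolutions in the backbone network are replaced with depthwise separable convolutions, which reduces the number of model parameters and raises the detection speed. In the feature fusion network, a bidirectional pyramid structure fuses deep and shallow features in both directions, and the 3-scale prediction is changed to 4-scale prediction, improving the ability to learn multi-scale features. For the loss function, GIoU (Generalized Intersection over Union) is selected, which improves the precision of object recognition and lowers the missed detection rate. Experimental results show that the algorithm based on improved YOLOv3 (You Only Look Once v3) reaches a mean Average Precision (mAP) of 83.26% on the Pascal VOC test set, 5.89 percentage points higher than the original YOLOv3 algorithm, with a detection speed of 22.0 frame/s. On the COCO dataset, its mAP is 3.28 percentage points higher than that of the original YOLOv3 algorithm. The mAP also improves in multi-scale object detection, which verifies the effectiveness of the algorithm.

Keywords: object detection, YOLOv3, multi-scale object, bidirectional feature pyramid, attention mechanism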
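
The parameter saving that the abstract attributes to depthwise separable convolution can be illustrated with a simple count. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, bias terms are ignored, and the kernel size and channel widths (3, 256, 512) are example values, not figures from the paper.

```python
def standard_conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in an ordinary k x k convolution (bias ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Illustrative layer shape (an assumption, not taken from the paper).
    k, c_in, c_out = 3, 256, 512
    standard = standard_conv_params(k, c_in, c_out)
    separable = depthwise_separable_params(k, c_in, c_out)
    print(f"standard:  {standard}")
    print(f"separable: {separable}")
    print(f"ratio:     {separable / standard:.4f}")
```

For any layer the ratio equals 1/out_channels + 1/kernel_size², so with 3 x 3 kernels and wide layers the separable form needs a little more than one ninth of the weights.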

Abstract:

To further improve the speed and precision of multi-scale object detection, and to address the missed, false, and repeated detections that commonly occur with small objects, an object detection algorithm based on improved You Only Look Once v3 (YOLOv3) was proposed for the automatic detection of multi-scale objects. Firstly, in the feature extraction network, an attention mechanism was introduced into the spatial dimension of the residual module so that the network attends to small objects. Then, a Dense Convolutional Network (DenseNet) was used to fully fuse the shallow-layer information of the network, and the ordinary convolutions in the backbone network were replaced with depthwise separable convolutions, thereby reducing the number of model parameters and improving the detection speed. In the feature fusion network, a bidirectional feature pyramid structure was used to fuse shallow and deep features in both directions, and the 3-scale prediction was changed to 4-scale prediction, which improved the ability to learn multi-scale features. For the loss function, Generalized Intersection over Union (GIoU) was adopted, which increased the precision of object recognition and reduced the missed detection rate. Experimental results show that on the Pascal VOC test set, the mean Average Precision (mAP) of the improved YOLOv3 algorithm reaches 83.26%, which is 5.89 percentage points higher than that of the original YOLOv3 algorithm, with a detection speed of 22.0 frame/s. On the Common Objects in COntext (COCO) dataset, the improved algorithm raises the mAP by 3.28 percentage points over the original YOLOv3 algorithm. The mAP also improves in multi-scale object detection, which verifies the effectiveness of the object detection algorithm based on improved YOLOv3.

Key words: object detection, YOLOv3 (You Only Look Once v3), multi-scale object, bidirectional feature pyramid, attention mechanism
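
As a rough illustration of the GIoU loss named in the abstract, the sketch below computes GIoU for two axis-aligned boxes using the standard definition: IoU minus the fraction of the smallest enclosing box that the union does not cover. It is not the authors' implementation. The function names, the `(x1, y1, x2, y2)` box format, and the `1 - GIoU` loss form are assumptions, and both boxes are assumed to have positive area.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed format)


def giou(box_a: Box, box_b: Box) -> float:
    """Generalized IoU of two axis-aligned boxes; the value lies in (-1, 1]."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle, clamped to zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest axis-aligned box that encloses both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    iou = inter / union
    return iou - (enclose - union) / enclose


def giou_loss(pred: Box, target: Box) -> float:
    """Loss in [0, 2): zero for identical boxes, larger as the boxes move apart."""
    return 1.0 - giou(pred, target)


if __name__ == "__main__":
    print(giou_loss((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0)))  # identical boxes
    print(giou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0)))  # disjoint boxes
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, GIoU still changes with the distance between them, so the loss continues to provide a training signal when a prediction misses its target entirely.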

CLC number: