Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (12): 3526-3533. DOI: 10.11772/j.issn.1001-9081.2020050641

• Artificial Intelligence •

Object detection algorithm based on asymmetric hourglass network structure

LIU Ziwei1,2,3, DENG Chunhua1,2,3, LIU Jing1,2,3

  1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, Hubei 430065, China;
  2. Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan, Hubei 430065, China;
  3. Hubei Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Wuhan University of Science and Technology), Wuhan, Hubei 430065, China
  • Received: 2020-05-15  Revised: 2020-07-20  Published: 2020-12-10  Released online: 2020-08-14
  • Corresponding author: LIU Jing (born 1984), female, from Xiaogan, Hubei; lecturer, Ph.D. Research interests: scheduling, fault tolerance, real-time systems, edge computing. luijing_cs@wust.edu.cn
  • About the authors: LIU Ziwei (born 1996), male, from Xiaogan, Hubei; M.S. candidate. Research interests: computer vision, machine learning. DENG Chunhua (born 1984), male, from Chenzhou, Hunan; associate professor, Ph.D. Research interests: computer vision, machine learning.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61806150), the Hubei Provincial Department of Science and Technology Program (2018CFB195), the Hubei Provincial Department of Education Science and Technology Research Program Young Talent Project (Q20181104), the Open Foundation of Hubei Key Laboratory of Intelligent Information Processing and Real-time Industrial System (znxx2018QN09), and the National Defense Advanced Research Foundation of Wuhan University of Science and Technology (GF201814).

Abstract: Anchor-free object detection based on deep learning is a mainstream family of single-stage object detection algorithms. An hourglass network structure that fuses supervisory information from multiple layers can significantly improve the accuracy of anchor-free object detection, but it is much slower than an ordinary network at the same level, and the features of objects at different scales interfere with one another. To address these problems, an object detection algorithm based on an asymmetric hourglass network structure is proposed. When fusing features from different network layers, the algorithm is not constrained by feature shape or size; it abstracts the semantic information of the network quickly and efficiently, which makes it easier for the model to learn the differences between scales. For detecting objects at different scales, an hourglass network structure with multi-scale outputs is designed to resolve the feature interference between objects of different scales and to refine the output detection results. In addition, a special non-maximum suppression algorithm is applied to the multi-scale outputs to improve the recall of the detector. Experimental results show that the proposed algorithm reaches an AP50 of 61.3% on the Common Objects in COntext (COCO) dataset, 4.2 percentage points higher than the anchor-free network CenterNet. The proposed algorithm achieves a better balance between accuracy and speed than the original algorithm and is particularly suitable for real-time detection of objects in industrial scenes.

Key words: deep learning, computer vision, convolutional neural network, single-stage object detection, anchor box, hourglass network
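
The abstract mentions applying non-maximum suppression to multi-scale outputs but does not describe the special variant used. As a rough illustration only, the sketch below pools detections from several output scales and applies standard greedy, class-agnostic IoU-based suppression. The box layout `(x1, y1, x2, y2, score)`, the function names `iou` and `merge_scales_nms`, and the 0.5 threshold are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple

# Assumed detection layout for illustration: (x1, y1, x2, y2, score).
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_scales_nms(per_scale: List[List[Box]], iou_thr: float = 0.5) -> List[Box]:
    """Pool detections from every output scale, then run greedy NMS.

    This is plain greedy NMS, a stand-in for the paper's unspecified variant.
    """
    pooled = sorted(
        (det for scale in per_scale for det in scale),
        key=lambda det: det[4],
        reverse=True,
    )
    kept: List[Box] = []
    for det in pooled:
        # Keep a box only if no higher-scoring kept box overlaps it too much.
        if all(iou(det, k) <= iou_thr for k in kept):
            kept.append(det)
    return kept


if __name__ == "__main__":
    small_scale = [(10, 10, 50, 50, 0.90), (200, 200, 240, 240, 0.60)]
    large_scale = [(12, 12, 52, 52, 0.80)]  # near-duplicate of the first box
    for det in merge_scales_nms([small_scale, large_scale]):
        print(det)
```

In this toy input the box from the second scale overlaps the highest-scoring box with an IoU of about 0.82, so it is suppressed and two detections remain. A real detector would typically run suppression per class and might weight or offset scores by scale, which this sketch does not attempt.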

CLC Number: