Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (11): 3234-3241. DOI: 10.11772/j.issn.1001-9081.2021010026

• Artificial intelligence •

Single shot multibox detector recognition method for aerial targets of unmanned aerial vehicle

Huaiyu ZHU1, Bo LI2

  1. School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu Sichuan 611731, China
  2. College of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Zhongshan Institute, Zhongshan Guangdong 528400, China
  • Received: 2021-01-07 Revised: 2021-02-03 Accepted: 2021-03-23 Online: 2021-04-15 Published: 2021-11-10
  • Contact: Bo LI
  • About author: ZHU Huaiyu, born in 1995, M.S. candidate. His research interests include machine vision and artificial intelligence.
    LI Bo, born in 1977, Ph.D., associate professor. His research interests include machine vision inspection and industrial automation.

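Single shot multibox detector method for recognizing targets in unmanned aerial vehicle aerial images

Huaiyu ZHU1, Bo LI2

  1. School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu 611731
  2. College of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Zhongshan Institute, Zhongshan, Guangdong 528400
  • Corresponding author: Bo LI
  • About the authors: ZHU Huaiyu (born 1995), male, from Zigong, Sichuan, M.S. candidate; main research interests: machine vision, artificial intelligence.
    LI Bo (born 1977), male, from Maoming, Guangdong, associate professor, M.S.; main research interests: machine vision inspection, industrial automation.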

Abstract:

Unmanned Aerial Vehicle (UAV) aerial images have a wide field of view, and the targets in them are small with blurred boundaries. As a result, the existing Single Shot multibox Detector (SSD) target detection model has difficulty detecting small targets in aerial images accurately. To address the original model's tendency to miss detections, a new SSD model based on continuous upsampling was proposed, drawing on the Feature Pyramid Network (FPN). In the improved SSD model, the input image size was adjusted to 320×320, the Conv3_3 feature layer was added, the high-level features were upsampled, and the features of the first five layers of the VGG16 network were fused using a feature pyramid structure, so as to enhance the semantic representation ability of each feature layer. The anchor box sizes were also redesigned. Training and validation were carried out on the public aerial dataset UCAS-AOD. Experimental results show that the improved SSD model achieves a mean Average Precision (mAP) of 94.78% across categories. Compared with the existing SSD model, its accuracy is increased by 17.62%, with gains of 4.66% for the plane category and 34.78% for the car category.

Key words: aerial image, Convolutional Neural Network (CNN), target detection, Single Shot multibox Detector (SSD), feature fusion
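
The upsampling and feature-pyramid fusion described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the fusion operator, the upsampling method, or the channel counts, so element-wise addition, nearest-neighbour 2x upsampling, 8 channels per map, and the map sizes 80, 40, 20, 10 and 5 are all assumptions chosen for illustration. The helper names `upsample2x` and `top_down_fuse` are hypothetical.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def top_down_fuse(features):
    """Fuse feature maps from coarsest to finest, FPN style.

    `features` is ordered fine -> coarse. Every map is assumed to have the
    same channel count and half the spatial size of the previous one, so the
    lateral 1x1 convolutions a real FPN would use are omitted here.
    """
    fused = [features[-1]]  # the coarsest map is passed through unchanged
    for feat in reversed(features[:-1]):
        # Add the upsampled, already-fused coarser map to the finer map.
        fused.append(feat + upsample2x(fused[-1]))
    return fused[::-1]  # restore fine -> coarse order

# Assumed spatial sizes for five feature layers (illustrative only).
rng = np.random.default_rng(0)
sizes = [80, 40, 20, 10, 5]
features = [rng.standard_normal((8, s, s)) for s in sizes]

fused = top_down_fuse(features)
shapes = [f.shape for f in fused]
```

Because each fused map already contains the coarser maps above it, the finest layer accumulates information from every higher level, which is the mechanism the abstract credits with strengthening the semantic content of the shallow layers used for small targets.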

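Abstract:

Unmanned aerial vehicle (UAV) aerial images cover a wide field of view, and the targets in them are small with blurred edges, so the existing Single Shot multibox Detector (SSD) target detection model has difficulty detecting small targets in aerial images accurately. To address the original model's tendency to miss detections, an SSD model based on continuous upsampling is proposed, drawing on the Feature Pyramid Network (FPN). The improved SSD model adjusts the input image size to 320×320, adds the Conv3_3 feature layer, upsamples the high-level features, and uses a feature pyramid structure to fuse the features of the first five layers of the VGG16 network, thereby enhancing the semantic representation ability of each feature layer; the anchor box sizes are also redesigned. The model was trained and validated on the public aerial dataset UCAS-AOD. Experimental results show that the proposed improved SSD model reaches a mean Average Precision (mAP) of 94.78% across categories; compared with the existing SSD model, its accuracy is improved by 17.62%, with the plane category improved by 4.66% and the car category by 34.78%.

Key words: aerial image, convolutional neural network, target detection, single shot multibox detector, feature fusion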
