Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (11): 3234-3241.DOI: 10.11772/j.issn.1001-9081.2021010026

Special Issue: Artificial intelligence

• Artificial intelligence •

Single shot multibox detector recognition method for aerial targets of unmanned aerial vehicle

Huaiyu ZHU1, Bo LI2

  1. School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
  2. College of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Zhongshan Institute, Zhongshan, Guangdong 528400, China
  • Received: 2021-01-07 Revised: 2021-02-03 Accepted: 2021-03-23 Online: 2021-04-15 Published: 2021-11-10
  • Contact: Bo LI
  • About author: ZHU Huaiyu, born in 1995, M.S. candidate. His research interests include machine vision and artificial intelligence.
    LI Bo, born in 1977, Ph.D., associate professor. His research interests include machine vision inspection and industrial automation.


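ZHU Huaiyu (朱槐雨)1, LI Bo (李博)2

  1. School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu 611731
  2. College of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Zhongshan Institute, Zhongshan, Guangdong 528400
  • Corresponding author: LI Bo (李博)
  • About the authors: ZHU Huaiyu (朱槐雨), born in 1995, male, from Zigong, Sichuan, M.S. candidate. His main research interests are machine vision and artificial intelligence.
    LI Bo (李博), born in 1977, male, from Maoming, Guangdong, associate professor, master's degree. His main research interests are machine vision inspection and industrial automation.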


Unmanned Aerial Vehicle (UAV) aerial images cover a wide field of view, and the targets in them are small and have blurred boundaries. As a result, the existing Single Shot multibox Detector (SSD) target detection model has difficulty detecting small targets in aerial images accurately. To address the original model's tendency to miss detections, a new SSD model based on continuous upsampling was proposed, building on the Feature Pyramid Network (FPN). In the improved SSD model, the input image size was adjusted to 320×320, the Conv3_3 feature layer was added, the high-level features were upsampled, and the features of the first five layers of the VGG16 network were fused through a feature pyramid structure to enhance the semantic representation ability of each feature layer. The anchor box sizes were also redesigned. Training and verification were carried out on the open aerial dataset UCAS-AOD. Experimental results show that the improved SSD model achieves a mean Average Precision (mAP) of 94.78% across categories; compared with the existing SSD model, its accuracy increases by 17.62%, including 4.66% for the plane category and 34.78% for the car category.
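
The fusion idea described in the abstract (upsample the higher-level features step by step and merge them into the shallower layers) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the strides, the element-wise addition used as the merge step, the nearest-neighbour upsampling, and the helper names `upsample2x`, `add_maps`, and `top_down_fuse` are all assumptions made for illustration, and the convolutions a real network would apply before and after fusion are omitted.

```python
# Illustrative sketch of FPN-style top-down fusion with continuous upsampling.
# Assumptions (not from the paper): strides of the five fused layers,
# element-wise addition as the merge, nearest-neighbour 2x upsampling.

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def add_maps(a, b):
    """Element-wise sum of two equally sized 2-D feature maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_fuse(pyramid):
    """Fuse maps ordered shallow (large) to deep (small).

    Each map must be exactly twice the size of the next one. The deepest
    map is kept as is; every shallower map receives the upsampled result
    of the level above it, so deep information accumulates on the way down.
    """
    fused = [pyramid[-1]]
    for fmap in reversed(pyramid[:-1]):
        fused.append(add_maps(fmap, upsample2x(fused[-1])))
    fused.reverse()
    return fused


INPUT_SIZE = 320                 # input resolution stated in the abstract
STRIDES = [4, 8, 16, 32, 64]     # hypothetical strides of the five fused layers
sizes = [INPUT_SIZE // s for s in STRIDES]

# Toy single-channel maps: level k is filled with the constant k.
pyramid = [[[float(k)] * n for _ in range(n)]
           for k, n in enumerate(sizes, start=1)]
fused = top_down_fuse(pyramid)

for n, fmap in zip(sizes, fused):
    print(f"{n}x{n} map, fused value {fmap[0][0]}")
```

With constant toy maps, the fused value at each level is simply the sum of that level and every deeper one, which makes it easy to see that the shallowest map (where small targets are detected) ends up carrying contributions from all higher-level layers.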

Key words: aerial image, Convolutional Neural Network (CNN), target detection, Single Shot multibox Detector (SSD), feature fusion

