Single shot multibox detector recognition method for aerial targets of unmanned aerial vehicle

doi:10.11772/j.issn.1001-9081.2021010026

Journal of Computer Applications, 2021, Vol. 41, Issue (11): 3234-3241.

Special topic: Artificial Intelligence

Huaiyu ZHU¹, Bo LI²

^1. School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
^2. College of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Zhongshan Institute, Zhongshan, Guangdong 528400, China

Received: 2021-01-07 Revised: 2021-02-03 Accepted: 2021-03-23 Online: 2021-04-15 Published: 2021-11-10
Corresponding author: Bo LI
About the authors: ZHU Huaiyu, born in 1995, a native of Zigong, Sichuan, M. S. candidate. His research interests include machine vision and artificial intelligence.
LI Bo, born in 1977, a native of Maoming, Guangdong, associate professor (listed with an M. S. in the Chinese biography and a Ph. D. in the English one). His research interests include machine vision inspection and industrial automation.

Abstract

Unmanned Aerial Vehicle (UAV) aerial images cover a wide field of view, so the targets in them are small and have blurred edges, and the existing Single Shot multibox Detector (SSD) target detection model struggles to detect such small targets accurately. To address the original model's tendency to miss detections, an SSD model based on continuous upsampling is proposed, drawing on the Feature Pyramid Network (FPN). The improved SSD model adjusts the input image size to $320 × 320$, adds the Conv3_3 feature layer, upsamples the high-level features, and fuses the features of the first five layers of the VGG16 network through a feature pyramid structure, thereby strengthening the semantic representation of each feature layer. The sizes of the anchor boxes are also redesigned. The model was trained and validated on the public aerial dataset UCAS-AOD. Experimental results show that the improved SSD model reaches a mean Average Precision (mAP) of 94.78% across categories, a relative improvement of 17.62% over the existing SSD model, with relative gains of 4.66% for the plane category and 34.78% for the car category.

Key words: aerial image, Convolutional Neural Network (CNN), target detection, Single Shot multibox Detector (SSD), feature fusion
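The abstract describes upsampling high-level features and fusing them with shallower ones through a feature pyramid structure. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes nearest-neighbour 2x upsampling, element-wise addition as the fusion operator, and feature maps that already share the same channel count. The function names `upsample_nearest_2x` and `fuse_top_down` and the toy map sizes are hypothetical.

```python
import numpy as np


def upsample_nearest_2x(feature: np.ndarray) -> np.ndarray:
    """Double the spatial size of a (channels, height, width) feature map."""
    return feature.repeat(2, axis=1).repeat(2, axis=2)


def fuse_top_down(features: list[np.ndarray]) -> list[np.ndarray]:
    """Fuse feature maps ordered from shallow (large) to deep (small).

    Starting from the deepest map, each fused result is upsampled and added
    to the next shallower map, so deep information is passed down step by step.
    Assumption: addition is used for fusion and channel counts already match.
    """
    fused = [features[-1]]
    for shallow in reversed(features[:-1]):
        fused.append(shallow + upsample_nearest_2x(fused[-1]))
    return fused[::-1]


# Toy pyramid: three 4-channel maps whose spatial size halves at each level.
rng = np.random.default_rng(0)
pyramid = [rng.standard_normal((4, size, size)) for size in (80, 40, 20)]
fused_pyramid = fuse_top_down(pyramid)
print([f.shape for f in fused_pyramid])
```

In a real detector the maps would first pass through convolutions that align their channel counts. Concatenation is an equally common fusion choice, and the page does not say which one the model uses.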

CLC number: TP183

Huaiyu ZHU, Bo LI. Single shot multibox detector recognition method for aerial targets of unmanned aerial vehicle[J]. Journal of Computer Applications, 2021, 41(11): 3234-3241.

Figures and tables: 18

Tab. 1 Comparison of mAP and frame rate of different target detection models on PASCAL VOC2007 dataset

Model	mAP/%	Frame rate/(frame·s^-1)
R-CNN	66	0.02
Fast R-CNN	70	0.4
Faster R-CNN	73	7
YOLO	66	21
SSD300	77	46
SSD512	80	19

Fig. 1 SSD model structure

Fig. 2 Intersection over Union (IoU) calculation
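Since the figure itself is not reproduced here, the IoU it refers to can be sketched in a few lines. This is the standard definition (intersection area divided by union area) for axis-aligned boxes. The `(x_min, y_min, x_max, y_max)` box layout and the function name `iou` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1) = 1/7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

In SSD-style training, an anchor box is typically treated as a positive match when its IoU with a ground-truth box exceeds a threshold. The threshold used by this paper is not stated on this page.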

Fig. 3 CU-SSD model structure

Fig. 4 Upsampling results on feature layer

Fig. 5 Feature fusion methods

Fig. 6 Feature fusion module

Fig. 7 Heat map of Conv3_3 layer

Tab. 2 Size of anchor box

Feature layer	SSD Min_size	SSD Max_size	CU-SSD Min_size	CU-SSD Max_size
Conv3_3	—	—	16	32
Conv4_3	30	60	32	64
fc7	60	111	64	118
Conv6_2	111	162	118	173
Conv7_2	152	213	162	227
Conv8_2	213	264	227	282
Conv9_2	264	315	282	336
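To make the Min_size and Max_size columns concrete, the sketch below turns one pair into anchor shapes using the convention of the original SSD prior-box layer: a square of side Min_size, a square of side sqrt(Min_size × Max_size), and a pair of rectangles per aspect ratio. Whether CU-SSD follows exactly this convention, and which aspect ratios it uses, is not stated on this page. The ratio `2.0` and the name `prior_box_shapes` are assumptions.

```python
import math


def prior_box_shapes(min_size, max_size, aspect_ratios=(2.0,)):
    """Return (width, height) pairs for one feature layer, in pixels."""
    shapes = [(float(min_size), float(min_size))]
    # Second square box: geometric mean of the two sizes.
    side = math.sqrt(min_size * max_size)
    shapes.append((side, side))
    # Each aspect ratio yields a wide box and its tall counterpart, same area.
    for ratio in aspect_ratios:
        width = min_size * math.sqrt(ratio)
        height = min_size / math.sqrt(ratio)
        shapes.append((width, height))
        shapes.append((height, width))
    return shapes


# First three CU-SSD rows of Tab. 2.
CU_SSD_SIZES = {"Conv3_3": (16, 32), "Conv4_3": (32, 64), "fc7": (64, 118)}
for layer, (min_size, max_size) in CU_SSD_SIZES.items():
    rounded = [(round(w, 1), round(h, 1)) for w, h in prior_box_shapes(min_size, max_size)]
    print(layer, rounded)
```

Under this convention the smallest Conv3_3 anchor is a 16 × 16 pixel square, compared with 30 × 30 on the shallowest layer of the baseline SSD in Tab. 2.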

Fig. 8 Comparison of anchor boxes of different feature layers

Tab. 3 Dataset composition

Category	Number of images	Number of samples
Total	1 510	14 596
Plane	1 000	7 482
Car	510	7 114

Fig. 9 Loss curves during training process

Tab. 4 Performance results of different models

Model	AP (car)/%	AP (plane)/%	mAP/%	Frame rate/(frame·s^-1)	Size/MB
SSD	69.33	91.84	80.58	13.6	91
FSSD	62.04	90.24	76.14	13.0	105
RFBNet	73.80	93.88	83.84	9.9	142
YOLOv3	86.07	94.06	90.07	12.0	235
CU-SSD	93.44	96.12	94.78	9.0	79
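The figures in Tab. 4 can be cross-checked against the abstract with a short calculation. The sketch below uses only values from the table. It shows that mAP is the mean of the two per-class APs (up to rounding in the published figures) and that the gains quoted in the abstract are relative to the SSD baseline, not percentage-point differences. The helper names `mean_ap` and `relative_gain` are hypothetical.

```python
# Per-class AP values (%) copied from Tab. 4.
AP = {
    "SSD": {"car": 69.33, "plane": 91.84},
    "CU-SSD": {"car": 93.44, "plane": 96.12},
}
REPORTED_MAP = {"SSD": 80.58, "CU-SSD": 94.78}


def mean_ap(per_class_ap):
    """mAP as the unweighted mean of the per-class APs."""
    return sum(per_class_ap.values()) / len(per_class_ap)


def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


for model, per_class in AP.items():
    print(model, "mean of APs:", round(mean_ap(per_class), 3), "reported:", REPORTED_MAP[model])

print("mAP gain:", round(relative_gain(REPORTED_MAP["CU-SSD"], REPORTED_MAP["SSD"]), 2))
print("car gain:", round(relative_gain(AP["CU-SSD"]["car"], AP["SSD"]["car"]), 2))
print("plane gain:", round(relative_gain(AP["CU-SSD"]["plane"], AP["SSD"]["plane"]), 2))
```

In absolute terms the mAP difference is 94.78 − 80.58 = 14.20 percentage points. Dividing by the 80.58 baseline gives the 17.62% reported in the abstract.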

Tab. 5 Experimental results of different feature fusion layers

Number of fused layers	AP (car)/%	AP (plane)/%	mAP/%
0	89.49	94.34	91.92
2	90.04	94.84	92.44
3	90.85	95.06	92.96
4	92.44	96.24	94.34
5	93.44	96.12	94.78
6	93.12	95.90	94.51

Fig. 10 P-R curves of car and plane categories
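The AP values in the tables summarize P-R curves such as those in Fig. 10. As a rough illustration, the sketch below computes AP as the area under a precision-recall curve after replacing each precision value with the maximum precision at any higher recall (all-point interpolation). The page does not say which AP protocol the paper uses, so this choice, the function name `average_precision`, and the toy curve are assumptions.

```python
def average_precision(recalls, precisions):
    """Area under a P-R curve; `recalls` must be sorted in ascending order."""
    # Pad so the curve spans recall 0 to 1.
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Make precision non-increasing from left to right (precision envelope).
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


# Toy curve: precision 1.0 up to recall 0.5, then 0.5 up to recall 1.0.
toy_ap = average_precision([0.5, 1.0], [1.0, 0.5])
print(toy_ap)
```

The mAP reported for a model is then the mean of such per-class AP values, here over the car and plane categories.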

Fig. 11 Detection effects of SSD and CU-SSD on car category

Fig. 12 Detection effects of SSD and CU-SSD on plane category

Tab. 6 Performance comparison of different improved modules

Group	Conv3_3	Fusion	Anchors	AP (car)/%	AP (plane)/%	mAP/%	Frame rate/(frame·s^-1)
1	×	×	×	69.33	91.84	80.58	13.6
2	×	√	×	63.78	90.51	77.15	12.2
3	×	×	√	85.18	94.15	89.66	9.6
4	×	√	√	89.19	94.28	91.74	9.6
5	√	×	√	89.49	94.34	91.92	9.2
6	√	√	√	93.44	96.12	94.78	9.0

