Aircraft detection and recognition based on deep convolutional neural network

Journal of Computer Applications, 2017, Vol. 37, Issue 6: 1702-1707. DOI: 10.11772/j.issn.1001-9081.2017.06.1702

YU Rujie¹, YANG Zhen¹, XIONG Huilin¹,²

1. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China;
2. Computer Pattern Recognition Laboratory, Shanghai Jiao Tong University, Shanghai 200240, China

Received: 2016-10-12 Revised: 2017-02-10 Online: 2017-06-14 Published: 2017-06-10
Corresponding author: YU Rujie
About the authors: YU Rujie (b. 1992), male, from Shanghai, M.S. candidate; main research interests: image interpretation and evaluation. YANG Zhen (b. 1985), male, from Heze, Shandong, Ph.D. candidate; main research interests: pattern recognition and computer vision. XIONG Huilin (b. 1964), male, from Huanggang, Hubei, professor, Ph.D.; main research interests: kernel-based nonlinear pattern recognition and machine learning, image processing, machine vision, and bioinformatics.
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61375008).

Abstract

For the specific application scenario of aircraft detection and recognition in large satellite images of military airports, a real-time target detection and recognition framework was established, applying a deep Convolutional Neural Network (CNN) to both the detection and the recognition of aircraft in large images. Firstly, target detection was treated as a regression problem over spatially independent bounding boxes, and a 24-layer convolutional neural network model was used to predict the bounding boxes. Then, an image classification network was used to classify the resulting target slices. Traditional target detection and recognition algorithms for large images usually struggle to improve time efficiency, whereas the proposed CNN-based framework makes full use of the computing hardware and greatly shortens the execution time. The framework was tested on a self-collected data set consistent with the application scenario. Target detection took 5.765 s per image on average and reached a precision of 79.2% at the operating point with a recall of 65.1%; the classification network took 0.972 s per image on average, with a Top-1 error rate of 13%. The proposed framework offers a new solution to the application problem of aircraft detection and recognition in large satellite images of military airports while maintaining both real-time performance and accuracy.

Key words: deep learning, Convolutional Neural Network (CNN), aircraft detection, target detection and recognition
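
The abstract reports a precision at a given recall for the detector and a Top-1 error rate for the classifier. As a reminder of how these quantities are usually defined, here is a minimal Python sketch. The class and function names are hypothetical, and the counts are invented for illustration; they are not the paper's data, and the paper's own matching rule between predicted and ground-truth boxes is not stated in the abstract.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DetectionCounts:
    """Hypothetical tallies after matching predicted boxes to ground truth."""
    true_positives: int   # predicted boxes that match a real aircraft
    false_positives: int  # predicted boxes with no matching aircraft
    false_negatives: int  # real aircraft that no predicted box matched


def precision(c: DetectionCounts) -> float:
    """Fraction of predicted boxes that are correct."""
    predicted = c.true_positives + c.false_positives
    return c.true_positives / predicted if predicted else 0.0


def recall(c: DetectionCounts) -> float:
    """Fraction of real aircraft that were detected."""
    actual = c.true_positives + c.false_negatives
    return c.true_positives / actual if actual else 0.0


def top1_error(predicted: Sequence[str], truth: Sequence[str]) -> float:
    """Share of slices whose highest-scoring class label is wrong."""
    if len(predicted) != len(truth):
        raise ValueError("label sequences must have the same length")
    if not truth:
        return 0.0
    wrong = sum(p != t for p, t in zip(predicted, truth))
    return wrong / len(truth)


# Illustrative numbers only; not taken from the paper.
counts = DetectionCounts(true_positives=8, false_positives=2, false_negatives=4)
print(f"precision={precision(counts):.3f}, recall={recall(counts):.3f}")
```

An operating point such as the one quoted in the abstract corresponds to one choice of detection confidence threshold: raising the threshold typically trades recall for precision.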

CLC Number: TP391.41

YU Rujie, YANG Zhen, XIONG Huilin. Aircraft detection and recognition based on deep convolutional neural network[J]. Journal of Computer Applications, 2017, 37(6): 1702-1707.
