Recognition and localization method of super-large-scale variance objects in the same scene

Journal of Computer Applications, 2020, Vol. 40, Issue 12: 3520-3525. DOI: 10.11772/j.issn.1001-9081.2020040466

WANG Yiting¹, ZHANG Ke², LI Jie², HAO Zongbo¹, DUAN Chang³,⁴, ZHU Ce⁴

1. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu Sichuan 610054, China;
2. Sichuan Jiuzhou Electric Group Company Limited, Mianyang Sichuan 621000, China;
3. School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu Sichuan 610500, China;
4. School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu Sichuan 611731, China

Received: 2020-04-14 Revised: 2020-06-08 Online: 2020-08-11 Published: 2020-12-10
Corresponding author: HAO Zongbo (1977-), male, born in Xinxiang, Henan, associate professor, Ph.D.; main research interest: video and image processing. zbhao@uestc.edu.cn
About the authors: WANG Yiting (1996-), female, born in Meishan, Sichuan, M.S. candidate; main research interest: image processing. ZHANG Ke (1983-), male, born in Mianyang, Sichuan, senior engineer, M.S.; research interests: signal and information processing, artificial intelligence. LI Jie (1969-), female, born in Qingzhou, Shandong, professor-level senior engineer, Ph.D.; main research interests: communication engineering, artificial intelligence. DUAN Chang (1979-), male, born in Suining, Sichuan, lecturer, Ph.D.; main research interests: image processing, artificial intelligence. ZHU Ce (1969-), male, born in Zigong, Sichuan, professor, Ph.D., CCF member; main research interests: image processing, artificial intelligence.
Supported by:
This work is partially supported by the Weapons and Equipment Pre-research Fund of the Equipment Development Department of the Central Military Commission (41412010201) and the Fundamental Research Funds for the Central Universities (A03013023001049).

Abstract

In recent years, deep learning has advanced rapidly and achieved very good results in object detection. However, in some special scenes, for example when objects whose scales differ by more than a factor of 100 must be detected simultaneously, the performance of existing object recognition methods drops sharply. To address the recognition and localization of objects with super-large-scale variance in the same scene, the You Only Look Once version 3 (YOLOv3) framework was improved by combining it with image pyramid technology to extract multi-scale features of the image. In addition, a dynamic Intersection over Union (IoU) strategy for objects of different scales was proposed for the training process, which better handles the problem of sample imbalance. Experimental results show that the proposed model clearly improves the recognition of super-large and super-small objects in the same scene. The model has been applied to an airport environment and achieved good application results.

Key words: super-large-scale variance, object recognition, You Only Look Once version 3 (YOLOv3), dynamic Intersection over Union (IoU), deep learning
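
The abstract names a dynamic IoU strategy but does not give its rule, so the following is only a minimal sketch of the general idea. The `iou` function is the standard definition; `dynamic_iou_threshold`, its `low`/`high` values, and the square-root interpolation are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import Tuple

# Axis-aligned box as (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Standard Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def dynamic_iou_threshold(gt: Box, image_area: float,
                          low: float = 0.3, high: float = 0.5) -> float:
    """HYPOTHETICAL rule (not from the paper): small ground-truth boxes get
    a threshold near `low`, large ones a threshold near `high`."""
    area = max(0.0, (gt[2] - gt[0]) * (gt[3] - gt[1]))
    relative_size = min(1.0, (area / image_area) ** 0.5)
    return low + (high - low) * relative_size


def is_positive(anchor: Box, gt: Box, image_area: float) -> bool:
    """Label an anchor as a positive sample for `gt` under the assumed rule."""
    return iou(anchor, gt) >= dynamic_iou_threshold(gt, image_area)
```

Under these assumed values, a 6x6 anchor around a 4x4 ground-truth box in a 400x400 image has an IoU of about 0.44. A fixed threshold of 0.5 would reject it, while the size-dependent threshold accepts it, which is one way a small object could receive more positive samples during training.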

CLC number: TP391

Cite this article:

WANG Yiting, ZHANG Ke, LI Jie, HAO Zongbo, DUAN Chang, ZHU Ce. Recognition and localization method of super-large-scale variance objects in the same scene[J]. Journal of Computer Applications, 2020, 40(12): 3520-3525.
