Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (12): 3520-3525. DOI: 10.11772/j.issn.1001-9081.2020040466

• Artificial Intelligence •

Recognition and localization method of super-large-scale variance objects in the same scene

WANG Yiting1, ZHANG Ke2, LI Jie2, HAO Zongbo1, DUAN Chang3,4, ZHU Ce4

  1. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu Sichuan 610054, China;
  2. Sichuan Jiuzhou Electric Group Company Limited, Mianyang Sichuan 621000, China;
  3. School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu Sichuan 610500, China;
  4. School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu Sichuan 611731, China
  • Received: 2020-04-14 Revised: 2020-06-08 Published: 2020-12-10 Online: 2020-08-11
  • Corresponding author: HAO Zongbo (1977-), male, born in Xinxiang, Henan, associate professor, Ph.D. Main research interests: video and image processing. zbhao@uestc.edu.cn
  • About the authors: WANG Yiting (1996-), female, born in Meishan, Sichuan, M.S. candidate. Main research interests: image processing; ZHANG Ke (1983-), male, born in Mianyang, Sichuan, senior engineer, M.S. Research interests: signal and information processing, artificial intelligence; LI Jie (1969-), female, born in Qingzhou, Shandong, professor-level senior engineer, Ph.D. Main research interests: communication engineering, artificial intelligence; DUAN Chang (1979-), male, born in Suining, Sichuan, lecturer, Ph.D. Main research interests: image processing, artificial intelligence; ZHU Ce (1969-), male, born in Zigong, Sichuan, professor, Ph.D., CCF member. Main research interests: image processing, artificial intelligence
  • Supported by:
    This work is partially supported by the Weapons and Equipment Pre-research Fund of the Equipment Development Department of the Central Military Commission (41412010201) and the Fundamental Research Funds for the Central Universities (A03013023001049).

Abstract: In recent years, deep learning has advanced rapidly and achieved very good results in object detection. However, in some special scenes, for example when objects whose scales differ greatly (by more than 100 times) must be detected simultaneously, the performance of existing object recognition methods drops sharply. To address the recognition and localization of objects with super-large-scale variance in the same scene, the You Only Look Once version 3 (YOLOv3) framework was improved by combining it with image pyramid technology to extract multi-scale features of the image. In addition, a dynamic Intersection over Union (IoU) strategy was proposed for objects of different scales during training, which better addresses the problem of sample imbalance. Experimental results show that the proposed model significantly improves the recognition of super-large and super-small objects in the same scene. The model has been applied to an airport environment and achieved good application results.

Key words: super-large-scale variance, object recognition, You Only Look Once version 3 (YOLOv3), dynamic Intersection over Union (IoU), deep learning
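
The two ideas named in the abstract, an image pyramid and a scale-dependent IoU threshold, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the pyramid parameters or the dynamic IoU rule, so the function names (`pyramid_sizes`, `dynamic_iou_threshold`), the halving factor, the 32-pixel minimum side, the 0.3 to 0.7 threshold range, and the choice of a lower threshold for smaller objects are all assumptions made here for illustration only.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pyramid_sizes(width: int, height: int,
                  factor: float = 0.5, min_side: int = 32) -> List[Tuple[int, int]]:
    """Image sizes of successive pyramid levels (assumed halving per level)."""
    sizes = []
    w, h = width, height
    while min(w, h) >= min_side:
        sizes.append((w, h))
        w, h = int(w * factor), int(h * factor)
    return sizes


def dynamic_iou_threshold(box: Box, image_area: float,
                          low: float = 0.3, high: float = 0.7) -> float:
    """Hypothetical rule: the positive-sample IoU threshold grows linearly
    with the square root of the box's relative area, so small objects are
    matched with a looser threshold than large ones."""
    area = (box[2] - box[0]) * (box[3] - box[1])
    rel = min(1.0, max(0.0, (area / image_area) ** 0.5))
    return low + (high - low) * rel


if __name__ == "__main__":
    print(pyramid_sizes(416, 416))
    image_area = 416.0 * 416.0
    print(dynamic_iou_threshold((0, 0, 4, 4), image_area))
    print(dynamic_iou_threshold((0, 0, 416, 416), image_area))
```

Under these assumptions, a 4x4-pixel object in a 416x416 image gets a threshold close to the lower bound, while an object that fills the frame gets the upper bound. Loosening the threshold for small objects is one plausible way to admit more positive samples for them during training; the rule actually used in the paper may differ.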

CLC number: