Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (10): 3191-3199. DOI: 10.11772/j.issn.1001-9081.2021081518

• Multimedia computing and computer simulation •

基于多视角多监督网络的无人机图像定位方法

周金坤1, 王先兰1, 穆楠2, 王晨3

  1.武汉邮电科学研究院, 武汉 430074
    2.四川师范大学 计算机科学学院, 成都 610101
    3.南京烽火天地通信科技有限公司, 南京 210019
  • 收稿日期:2021-08-26 修回日期:2021-12-14 接受日期:2021-12-14 发布日期:2022-01-07 出版日期:2022-10-10
  • 通讯作者: 王晨
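  • About the authors: Jinkun ZHOU (周金坤), born in 1995 in Jingzhou, Hubei, male, M.S. candidate. Research interests: deep learning, computer vision.
    Xianlan WANG (王先兰), born in 1969 in Gong'an, Hubei, female, senior engineer. Research interests: artificial intelligence, data communication.
    Nan MU (穆楠), born in 1991 in Nanyang, Henan, male, lecturer, Ph.D. Research interests: image processing, computer vision.
    Chen WANG (王晨), born in 1979 in Nanjing, Jiangsu, male, senior engineer, M.S. Research interests: network security, deep learning.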
  • 基金资助:
    国家自然科学基金资助项目(62006165)

Unmanned aerial vehicle image localization method based on multi-view and multi-supervision network

Jinkun ZHOU1, Xianlan WANG1, Nan MU2, Chen WANG3

  1. Wuhan Research Institute of Posts and Telecommunications, Wuhan Hubei 430074, China
    2. College of Computer Science, Sichuan Normal University, Chengdu Sichuan 610101, China
    3. Nanjing Fiberhome Tiandi Communication Technology Company Limited, Nanjing Jiangsu 210019, China
  • Received:2021-08-26 Revised:2021-12-14 Accepted:2021-12-14 Online:2022-01-07 Published:2022-10-10
  • Contact: Chen WANG
  • Supported by:
    National Natural Science Foundation of China(62006165)

摘要:

针对现有跨视角图像匹配算法精度低的问题,提出了一种基于多视角多监督网络(MMNet)的无人机(UAV)定位方法。首先,所提方法融合卫星视角和UAV视角,在统一的网络架构下学习全局和局部特征并以多监督方式训练分类网络并执行度量任务。具体来说,MMNet主要采用了重加权正则化三元组损失(RRT)学习全局特征,该损失利用重加权和距离正则化加权策略来解决多视角样本不平衡以及特征空间结构紊乱的问题。同时,为了关注目标地点中心建筑的上下文信息,MMNet对特征图进行方形环切割来获取局部特征。然后,分别用交叉熵损失和RRT执行分类和度量任务。最终,使用加权策略聚合全局和局部特征来表征目标地点图像。通过在当前流行的UAV数据集University-1652上进行实验,可知MMNet在UAV定位任务的召回率Recall@1 (R@1)及平均精准率(AP)上分别达到83.97%和86.96%。实验结果表明,相较于LCM、SFPN等方法,MMNet显著提升了跨视角图像的匹配精度,进而增强了UAV图像定位的实用性。

关键词: 无人机图像定位, 跨视角图像匹配, 地理定位, 度量学习, 深度学习

Abstract:

To address the low accuracy of existing cross-view image matching algorithms, an Unmanned Aerial Vehicle (UAV) image localization method based on a Multi-view and Multi-supervision Network (MMNet) was proposed. In the proposed method, the satellite and UAV views were integrated, global and local features were learned under a unified network architecture, and the network was trained in a multi-supervision manner on both classification and metric tasks. Specifically, MMNet mainly used the Reweighted Regularization Triplet loss (RRT) to learn global features; this loss applies reweighting and distance regularization strategies to address the imbalance of multi-view samples and the disordered structure of the feature space. Meanwhile, to capture the contextual information around the central building of the target location, MMNet obtained local features by cutting the feature map into square rings. The cross-entropy loss and RRT were then used to perform the classification and metric tasks, respectively. Finally, the global and local features were aggregated with a weighted strategy to represent target location images. On the widely used UAV dataset University-1652, MMNet achieved a Recall@1 (R@1) of 83.97% and an Average Precision (AP) of 86.96% on the UAV localization task. Experimental results show that, compared with LCM (cross-view Matching based on Location Classification), SFPN (Salient Feature Partition Network) and other methods, MMNet significantly improves the accuracy of cross-view image matching and thereby enhances the practicability of UAV image localization.
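
The square-ring cutting and weighted aggregation steps mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`square_ring_pool`, `aggregate`), the number of rings, the equal-width ring boundaries, the use of average pooling, and the `local_weight` value are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def square_ring_pool(feature_map: np.ndarray, num_rings: int = 4) -> np.ndarray:
    """Average-pool a (C, H, W) feature map over concentric square rings.

    Returns an array of shape (num_rings, C); row 0 is the innermost square.
    Assumes H and W are both at least 2 * num_rings so no ring is empty.
    """
    channels, height, width = feature_map.shape
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    ys, xs = np.mgrid[0:height, 0:width]
    # Normalised Chebyshev distance from the centre: its level sets are squares.
    dist = np.maximum(np.abs(ys - cy) / (height / 2.0),
                      np.abs(xs - cx) / (width / 2.0))
    # Equal-width rings (an assumption); clamp so the border stays in the last ring.
    ring_index = np.minimum((dist * num_rings).astype(int), num_rings - 1)

    parts = np.zeros((num_rings, channels))
    for r in range(num_rings):
        mask = ring_index == r
        parts[r] = feature_map[:, mask].mean(axis=1)
    return parts


def l2_normalize(v: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """L2-normalise along the last axis; all-zero vectors stay zero."""
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)


def aggregate(global_feat: np.ndarray, local_parts: np.ndarray,
              local_weight: float = 0.5) -> np.ndarray:
    """Concatenate the normalised global feature with down-weighted ring features.

    local_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    g = l2_normalize(global_feat)
    parts = l2_normalize(local_parts).reshape(-1)
    return np.concatenate([g, local_weight * parts])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.standard_normal((512, 16, 16))   # stand-in for a backbone output
    parts = square_ring_pool(fmap, num_rings=4)
    descriptor = aggregate(fmap.mean(axis=(1, 2)), parts)
    print(parts.shape, descriptor.shape)
```

In a retrieval setting, descriptors built this way for UAV and satellite images would be compared by a similarity measure such as cosine similarity, so the weight controls how much the ring features contribute relative to the global feature.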

Key words: Unmanned Aerial Vehicle (UAV) image localization, cross-view image matching, geo-localization, metric learning, deep learning

CLC number: