Journal of Computer Applications, 2023, Vol. 43, Issue (5): 1625-1635. DOI: 10.11772/j.issn.1001-9081.2022040541

• Frontier and comprehensive applications •

Cross-view geo-localization method based on multi-task joint learning

Xianlan WANG1, Jinkun ZHOU1, Nan MU2, Chen WANG3

  1. Wuhan Research Institute of Posts and Telecommunications, Wuhan Hubei 430074, China
    2. College of Computer Science, Sichuan Normal University, Chengdu Sichuan 610101, China
    3. Nanjing Fiberhome Tiandi Communication Technology Company Limited, Nanjing Jiangsu 210019, China
  • Received: 2022-04-18 Revised: 2022-07-04 Accepted: 2022-07-05 Online: 2022-08-12 Published: 2023-05-10
  • Contact: Chen WANG (wangchen5005@fiberhome.com)
  • About author: WANG Xianlan, born in 1969, from Jingzhou, Hubei, senior engineer. Her research interests include artificial intelligence and data communication.
    ZHOU Jinkun, born in 1995, from Jingzhou, Hubei, M. S. candidate. His research interests include deep learning and computer vision.
    MU Nan, born in 1991, from Nanyang, Henan, Ph. D., lecturer. His research interests include image processing and computer vision.
    WANG Chen, born in 1979, from Nanjing, Jiangsu, M. S., senior engineer. His research interests include network security and deep learning.
  • Supported by:
    National Natural Science Foundation of China (62006165)

Abstract:

To address the performance bottleneck caused by treating viewpoint-invariant feature methods and view transformation methods separately in existing cross-view geo-localization approaches, a Multi-task Joint Learning Model (MJLM) was proposed. MJLM consisted of a preceding image generation model and a subsequent image retrieval model. In the generation model, Inverse Perspective Mapping (IPM) was first used for coordinate transformation to explicitly bridge the spatial domain gap, so that the spatial geometric features of the projected image were approximately the same as those of the real satellite image. Then, the proposed Cross-View Generative Adversarial Network (CVGAN) was used to implicitly match and restore image contents and textures at a fine-grained level and to synthesize smoother and more realistic satellite images. The retrieval model was composed of a Multi-view and Multi-supervision Network (MMNet), which performed image retrieval using both multi-scale features and multi-supervised learning. Experimental results on the Unmanned Aerial Vehicle (UAV) localization dataset University-1652 show that MJLM achieves an Average Precision (AP) of 89.22% and a Recall@1 (R@1) of 87.54%. Compared with LPN (Local Pattern Network) and MSBA (MultiScale Block Attention), MJLM improves R@1 by 15.29% and 1.07%, respectively. These results indicate that MJLM handles the cross-view image synthesis and retrieval tasks jointly within a single framework, fuses the view transformation and viewpoint-invariant feature approaches, significantly improves the precision and robustness of cross-view geo-localization, and verifies the feasibility of UAV localization.
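
For readers unfamiliar with the two retrieval metrics quoted in the abstract, the following is a minimal sketch of how Recall@K and Average Precision are commonly computed for a ranked retrieval result. It is an illustration under stated assumptions, not the evaluation code used by the authors: the function names and the 0/1 label-list input format are hypothetical, and the AP shown is the plain non-interpolated variant, which may differ slightly from the variant used in the University-1652 benchmark.

```python
from typing import Sequence


def recall_at_k(ranked_labels: Sequence[Sequence[int]], k: int = 1) -> float:
    """Fraction of queries with at least one true match among the top-k results.

    ranked_labels[i][j] is 1 if the j-th ranked gallery image for query i
    shows the same location as the query, and 0 otherwise (assumed format).
    """
    hits = sum(1 for labels in ranked_labels if any(labels[:k]))
    return hits / len(ranked_labels)


def average_precision(labels: Sequence[int]) -> float:
    """Non-interpolated AP for one query: mean precision at each true-match rank."""
    num_hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(labels, start=1):
        if is_match:
            num_hits += 1
            precision_sum += num_hits / rank
    # A query with no true match in the ranking contributes an AP of 0.
    return precision_sum / num_hits if num_hits else 0.0


def mean_average_precision(ranked_labels: Sequence[Sequence[int]]) -> float:
    """Average of the per-query AP values over all queries."""
    return sum(average_precision(labels) for labels in ranked_labels) / len(ranked_labels)


if __name__ == "__main__":
    # Three toy queries; each list marks which ranked gallery images are true matches.
    ranked = [[1, 0, 0], [0, 1, 0], [0, 0, 1, 1]]
    print(f"R@1 = {recall_at_k(ranked, k=1):.4f}")
    print(f"AP  = {mean_average_precision(ranked):.4f}")
```

In this sketch R@1 only rewards a correct match at the first rank, while AP also credits true matches that appear lower in the ranking, which is why the two figures reported for a method can differ.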

Key words: cross-view geo-localization, Unmanned Aerial Vehicle (UAV) image localization, view transformation, feature extraction, deep learning

CLC Number: