Unmanned aerial vehicle image localization method based on multi-view and multi-supervision network

doi:10.11772/j.issn.1001-9081.2021081518

Journal of Computer Applications, 2022, Vol. 42, Issue (10): 3191-3199. DOI: 10.11772/j.issn.1001-9081.2021081518

• Multimedia computing and computer simulation •

Jinkun ZHOU¹, Xianlan WANG¹, Nan MU², Chen WANG³

1. Wuhan Research Institute of Posts and Telecommunications, Wuhan Hubei 430074, China
2. College of Computer Science, Sichuan Normal University, Chengdu Sichuan 610101, China
3. Nanjing Fiberhome Tiandi Communication Technology Company Limited, Nanjing Jiangsu 210019, China

Received: 2021-08-26 Revised: 2021-12-14 Accepted: 2021-12-14 Online: 2022-01-07 Published: 2022-10-10
Contact: Chen WANG
About the authors: Jinkun ZHOU, born in 1995 in Jingzhou, Hubei, is an M.S. candidate. His research interests include deep learning and computer vision.
Xianlan WANG, born in 1969 in Gong'an, Hubei, is a senior engineer. Her research interests include artificial intelligence and data communication.
Nan MU, born in 1991 in Nanyang, Henan, Ph.D., is a lecturer. His research interests include image processing and computer vision.
Chen WANG, born in 1979 in Nanjing, Jiangsu, M.S., is a senior engineer. His research interests include network security and deep learning.
Supported by:
National Natural Science Foundation of China (62006165)

Abstract:

To address the low accuracy of existing cross-view image matching algorithms, an Unmanned Aerial Vehicle (UAV) image localization method based on a Multi-view and Multi-supervision Network (MMNet) is proposed. The method fuses the satellite and UAV views, learns global and local features within a unified network architecture, and trains the network under multiple supervision, combining a classification task with a metric task. Specifically, MMNet learns global features mainly with a Reweighted Regularization Triplet loss (RRT), which uses reweighting and distance regularization strategies to counter the imbalance of multi-view samples and the disordered structure of the feature space. To attend to the context around the central building of the target location, MMNet cuts the feature map into square rings to obtain local features. Cross-entropy loss and RRT are then used for the classification and metric tasks, respectively. Finally, the global and local features are aggregated with a weighting strategy to represent the image of the target location. On the widely used UAV dataset University-1652, MMNet achieves a Recall@1 (R@1) of 83.97% and an Average Precision (AP) of 86.96% on the UAV localization task. Experimental results show that, compared with methods such as LCM (cross-view Matching based on Location Classification) and SFPN (Salient Feature Partition Network), MMNet significantly improves cross-view image matching accuracy and thus makes UAV image localization more practical.

Key words: Unmanned Aerial Vehicle (UAV) image localization, cross-view image matching, geo-localization, metric learning, deep learning
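
The final step of the abstract, aggregating global and local features with a weighting strategy and matching across views, can be illustrated with a minimal sketch. The abstract does not give the aggregation formula, so everything here is an assumption: the function names, the `local_weight` parameter, and the choice of concatenation with cosine-similarity ranking are hypothetical stand-ins rather than the authors' implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def aggregate(global_feat, local_feats, local_weight=1.0):
    """Concatenate the normalized global feature with weighted, normalized local features.

    local_weight is a hypothetical parameter; the paper's actual weighting is not
    specified in the abstract.
    """
    parts = l2_normalize(global_feat)
    for feat in local_feats:
        parts += [local_weight * x for x in l2_normalize(feat)]
    return l2_normalize(parts)


def rank_gallery(query, gallery):
    """Return gallery indices sorted by descending cosine similarity to the query.

    Inputs are assumed to be unit-length, so the dot product equals cosine similarity.
    """
    sims = [sum(q * g for q, g in zip(query, item)) for item in gallery]
    return sorted(range(len(gallery)), key=lambda i: sims[i], reverse=True)
```

In this sketch a UAV query descriptor and each satellite gallery descriptor would be built with the same `aggregate` call, and the top-ranked gallery index would be taken as the predicted location.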

CLC number: TP391.4

Jinkun ZHOU, Xianlan WANG, Nan MU, Chen WANG. Unmanned aerial vehicle image localization method based on multi-view and multi-supervision network[J]. Journal of Computer Applications, 2022, 42(10): 3191-3199.

Figures and tables (9)

Fig. 1 Schematic diagram of UAV image localization and navigation tasks

Fig. 2 Architecture of MMNet

Fig. 3 Square ring cutting strategy
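
The square ring cutting named in Fig. 3 can be sketched as follows. Since the figure itself is not reproduced here, this is only one plausible reading: each cell of a feature map is assigned to a concentric square ring according to its distance from the nearest border, and each ring is average-pooled into one local feature. The function names, the ring numbering (0 is innermost), and the use of average pooling are assumptions for illustration.

```python
def ring_index(row, col, height, width, num_rings):
    """Map a feature-map cell to a ring: 0 is the innermost block, num_rings - 1 the outermost."""
    depth = min(row, col, height - 1 - row, width - 1 - col)  # distance to nearest border
    max_depth = (min(height, width) + 1) // 2                  # number of distinct depths
    from_border = min(depth * num_rings // max_depth, num_rings - 1)
    return num_rings - 1 - from_border


def ring_average_pool(feature_map, num_rings):
    """Average a height x width grid of channel vectors over each square ring.

    feature_map[r][c] is a list of channel values. Rings that receive no cells
    (possible on very small maps) are returned as zero vectors.
    """
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    sums = [[0.0] * channels for _ in range(num_rings)]
    counts = [0] * num_rings
    for r in range(height):
        for c in range(width):
            k = ring_index(r, c, height, width, num_rings)
            counts[k] += 1
            for ch in range(channels):
                sums[k][ch] += feature_map[r][c][ch]
    return [
        [s / counts[k] for s in sums[k]] if counts[k] else sums[k]
        for k in range(num_rings)
    ]
```

On an 8×8 map with four rings, the innermost ring covers the central 2×2 block (4 cells) and the outermost ring covers the 28 border cells, so the central building and its surrounding context end up in separate local features.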

Tab. 1 Comparison of the proposed method with state-of-the-art methods on University-1652 dataset (%)

Method	UAV→Satellite R@1	UAV→Satellite AP	Satellite→UAV R@1	Satellite→UAV AP
IL [6]	58.23	62.91	74.47	59.45
LCM [27]	66.65	70.82	79.89	65.38
SFPN [38]	70.83	77.36	80.26	71.58
LPN [29]	75.93	79.14	86.45	74.79
MMNet	83.97	86.96	90.15	84.69
MMNet (distractors)	81.15	84.92	-	-
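
For readers unfamiliar with the two metrics in Tab. 1, the sketch below computes Recall@K and average precision for a single query from a ranked gallery list, using the textbook definitions. The function names and identifiers are illustrative, and the exact evaluation protocol used for the table (for example how AP is interpolated) is not stated on this page, so the reported numbers may come from a slightly different formula.

```python
def recall_at_k(ranked_ids, relevant_ids, k=1):
    """Return 1.0 if any true match appears in the top-k results, else 0.0."""
    relevant = set(relevant_ids)
    return 1.0 if any(item in relevant for item in ranked_ids[:k]) else 0.0


def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values taken at the rank of each true match."""
    relevant = set(relevant_ids)
    hits, precision_sum = 0, 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0
```

A table entry would then be the mean of these per-query values over all queries, expressed as a percentage.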

Fig. 4 Search results of UAV image localization and navigation tasks

Tab. 2 Comparison results of different MMNet modules on University-1652 dataset (%)

Method	UAV→Satellite R@1	UAV→Satellite AP	Satellite→UAV R@1	Satellite→UAV AP
MMNet	83.97	86.96	90.15	84.69
GF (ID)	64.02	69.14	79.61	63.08
LF (ID)	78.53	82.61	88.30	76.89
JF (ID)	79.65	84.46	88.82	79.11
GF (RRT)	64.96	70.92	72.88	64.12

Tab. 3 Results of different sampling strategies in MMNet (%)

Sampling strategy	UAV→Satellite R@1	UAV→Satellite AP	Satellite→UAV R@1	Satellite→UAV AP
Batch mining [6]	82.93	83.53	87.46	82.05
MBM	83.97	86.96	90.15	84.69

Tab. 4 Comparison of RRT with other metric losses (%)

Method	UAV→Satellite R@1	UAV→Satellite AP	Satellite→UAV R@1	Satellite→UAV AP
CL [6]	52.39	57.44	63.91	52.24
TL [6]	55.18	59.97	64.48	53.15
WSM [21]	53.21	58.03	65.62	54.47
RRT	57.26	61.82	65.48	55.82
RRT+MBM	59.93	64.96	67.70	58.21
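
To give a sense of what the metric losses in Tab. 4 compute, the sketch below contrasts a standard margin triplet loss (which TL presumably denotes) with a soft-margin triplet loss that reweights positives and negatives by softmax. The second function is explicitly not the paper's RRT, whose formula is not given on this page; it is one common reweighting scheme, shown only to illustrate the idea of emphasizing hard samples. All names and the default margin are assumptions.

```python
import math


def triplet_loss(d_pos, d_neg, margin=0.3):
    """Standard margin triplet loss for one (anchor, positive, negative) pair of distances."""
    return max(0.0, d_pos - d_neg + margin)


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def reweighted_soft_triplet(pos_dists, neg_dists):
    """Soft-margin triplet loss with softmax reweighting (illustrative, not the paper's RRT).

    Far positives and near negatives, i.e. the hard samples, receive the largest weights.
    """
    w_pos = softmax(pos_dists)
    w_neg = softmax([-d for d in neg_dists])
    pos_term = sum(w * d for w, d in zip(w_pos, pos_dists))
    neg_term = sum(w * d for w, d in zip(w_neg, neg_dists))
    x = pos_term - neg_term
    # softplus(x) = log(1 + exp(x)), written in a form that avoids overflow for large x
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

Because every positive and negative of an anchor contributes with a weight instead of a single mined triplet, a loss of this kind can soften the effect of having many UAV images but few satellite images per location, which is the sample imbalance the abstract refers to.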

Fig. 5 Influence of hyperparameter β on performance of UAV image localization task

References (38)

1	WU Z C， NI M， HU Z W， et al. Mapping invasive plant with UAV-derived 3D mesh model in mountain area—a case study in Shenzhen Coast， China［J］. International Journal of Applied Earth Observation and Geoinformation， 2019， 77： 129-139. 10.1016/j.jag.2018.12.001
2	DENG L， MAO Z H， LI X J， et al. UAV-based multispectral remote sensing for precision agriculture： a comparison between different cameras［J］. ISPRS Journal of Photogrammetry and Remote Sensing， 2018， 146： 124-136. 10.1016/j.isprsjprs.2018.09.008
3	YAN Y N， DENG L， LIU X L， et al. Application of UAV-based multi-angle hyperspectral remote sensing in fine vegetation classification［J］. Remote Sensing， 2019， 11（23）： No.2753. 10.3390/rs11232753
4	ZHAO S， HUANG H Y， HU Y M， et al. Vehicle detection in satellite imagery based on deep learning［J］. Journal of Computer Applications， 2019， 39（S2）： 91-96.
5	LIU W， YANG M Y， XIE M， et al. Accurate building extraction from fused DSM and UAV images using a chain fully convolutional neural network［J］. Remote Sensing， 2019， 11（24）： No.2912. 10.3390/rs11242912
6	ZHENG Z D， WEI Y C， YANG Y. University-1652： a multi-view multi-source benchmark for drone-based geo-localization［C］// Proceedings of the 28th ACM International Conference on Multimedia. New York： ACM， 2020： 1395-1403. 10.1145/3394171.3413896
7	VO N， JACOBS N， HAYS J. Revisiting IM2GPS in the deep learning era［C］// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway： IEEE， 2017： 2640-2649. 10.1109/iccv.2017.286
8	SATTLER T， HAVLENA M， SCHINDLER K， et al. Large-scale location recognition and the geometric burstiness problem［C］// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2016： 1582-1590. 10.1109/cvpr.2016.175
9	HAYS J， EFROS A A. IM2GPS： estimating geographic information from a single image［C］// Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2008： 1-8. 10.1109/cvpr.2008.4587784
10	SHI Y J， YU X， CAMPBELL D， et al. Where am I looking at？ joint location and orientation estimation by cross-view matching［C］// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2020： 4063-4071. 10.1109/cvpr42600.2020.00412
11	TOKER A， ZHOU Q J， MAXIMOV M， et al. Coming down to earth： satellite-to-street view synthesis for geo-localization［C］// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2021： 6484-6493. 10.1109/cvpr46437.2021.00642
12	ZHU S J， YANG T J N， CHEN C. VIGOR： cross-view image geo-localization beyond one-to-one retrieval［C］// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2021： 5316-5325. 10.1109/cvpr46437.2021.00364
13	SHI Y J， YU X， LIU L， et al. Optimal feature transport for cross-view image geo-localization［C］// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto， CA： AAAI Press， 2020： 11990-11997. 10.1609/aaai.v34i07.6875
14	LOWE D G. Distinctive image features from scale-invariant keypoints［J］. International Journal of Computer Vision， 2004， 60（2）： 91-110. 10.1023/b:visi.0000029664.99615.94
15	SUN P， XIAO J， ZHAO H M， et al. Design and implementation of SIFT algorithm for UAV remote sensing image based on DSP platform［J］. Journal of Computer Applications， 2020， 40（4）： 1237-1242. 10.11772/j.issn.1001-9081.2019091689
16	BAY H， TUYTELAARS T， GOOL L V. SURF： speeded up robust features［C］// Proceedings of the 2006 European Conference on Computer Vision， LNCS 3951. Berlin： Springer， 2006： 404-417.
17	ZHAI M H， BESSINGER Z， WORKMAN S， et al. Predicting ground-level scene layout from aerial imagery［C］// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2017： 4132-4140. 10.1109/cvpr.2017.440
18	WORKMAN S， JACOBS N. On the location dependence of convolutional neural network features［C］// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Piscataway： IEEE， 2015： 70-78. 10.1109/cvprw.2015.7301385
19	TIAN Y C， CHEN C， SHAH M. Cross-view image matching for geo-localization in urban environments［C］// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2017： 1998-2006. 10.1109/cvpr.2017.216
20	VO N N， HAYS J. Localizing and orienting street views using overhead imagery［C］// Proceedings of the 2016 European Conference on Computer Vision， LNCS 9905. Cham： Springer， 2016： 494-509.
21	HU S X， FENG M D， NGUYEN R M H， et al. CVM-Net： cross-view matching network for image-based ground-to-aerial geo-localization［C］// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2018： 7258-7267. 10.1109/cvpr.2018.00758
22	ARANDJELOVIC R， GRONAT P， TORII A， et al. NetVLAD： CNN architecture for weakly supervised place recognition［C］// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2016： 5297-5307. 10.1109/cvpr.2016.572
23	REGMI K， SHAH M. Bridging the domain gap for ground-to-aerial image matching［C］// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway： IEEE， 2019： 470-479. 10.1109/iccv.2019.00056
24	WANG T C， LIU M Y， ZHU J Y， et al. High-resolution image synthesis and semantic manipulation with conditional GANs［C］// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2018： 8798-8807. 10.1109/cvpr.2018.00917
25	CAI S D， GUO Y L， KHAN S， et al. Ground-to-aerial image geo-localization with a hard exemplar reweighting triplet loss［C］// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway： IEEE， 2019： 8390-8399. 10.1109/iccv.2019.00848
26	LIU L， LI H D. Lending orientation to neural networks for cross-view geo-localization［C］// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2019： 5617-5626. 10.1109/cvpr.2019.00577
27	DING L R， ZHOU J， MENG L X， et al. A practical cross-view image matching method between UAV and satellite for UAV-based geo-localization［J］. Remote Sensing， 2020， 13（1）： No.47. 10.3390/rs13010047
28	HU S Y， CHANG X J. Multi-view drone-based geo-localization via style and spatial alignment［EB/OL］. （2020-07-09）［2021-05-28］..
29	WANG T Y， ZHENG Z D， YAN C G， et al. Each part matters： local patterns facilitate cross-view geo-localization［J］. IEEE Transactions on Circuits and Systems for Video Technology， 2022， 32（2）： 867-879. 10.1109/tcsvt.2021.3061265
30	SIMONYAN K， ZISSERMAN A. Very deep convolutional networks for large-scale image recognition［EB/OL］. （2015-04-10）［2021-05-28］..
31	HE K M， ZHANG X Y， REN S Q， et al. Deep residual learning for image recognition［C］// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2016： 770-778. 10.1109/cvpr.2016.90
32	RADENOVIĆ F， TOLIAS G， CHUM O. Fine-tuning CNN image retrieval with no human annotation［J］. IEEE Transactions on Pattern Analysis and Machine Intelligence， 2019， 41（7）： 1655-1668. 10.1109/tpami.2018.2846566
33	SCHROFF F， KALENICHENKO D， PHILBIN J. FaceNet： a unified embedding for face recognition and clustering［C］// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2015： 815-823. 10.1109/cvpr.2015.7298682
34	LIAO W T， YANG M Y， ZHAN N， et al. Triplet-based deep similarity learning for person re-identification［C］// Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops. Piscataway： IEEE， 2017： 385-393. 10.1109/iccvw.2017.52
35	HERMANS A， BEYER L， LEIBE B. In defense of the triplet loss for person re-identification［EB/OL］. （2017-11-27）［2021-05-28］.. 10.21203/rs.3.rs-1501673/v1
36	YU B S， LIU T L， GONG M M， et al. Correcting the triplet selection bias for triplet loss［C］// Proceedings of the 2018 European Conference on Computer Vision， LNCS 11210. Cham： Springer， 2018：71-86.
37	GORDO A， ALMAZÁN J， REVAUD J， et al. End-to-end learning of deep visual representations for image retrieval［J］. International Journal of Computer Vision， 2017， 124（2）： 237-254. 10.1007/s11263-017-1016-8
38	HE S J， WANG Y H. Cross-view geo-localization via salient feature partition network［J］. Journal of Physics： Conference Series， 2021， 1914： No.012009. 10.1088/1742-6596/1914/1/012009

