Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (10): 3200-3208. DOI: 10.11772/j.issn.1001-9081.2021081510
Special Topic: Multimedia Computing and Computer Simulation
Cross-modal person re-identification model based on dynamic dual-attention mechanism
Dawei LI1,2, Zhiyong ZENG1,2
Received: 2021-08-24
Revised: 2021-12-06
Accepted: 2021-12-06
Online: 2022-01-07
Published: 2022-10-10
Contact: Zhiyong ZENG
About author: LI Dawei, born in 1997 in Lu'an, Anhui, M.S. candidate. His research interests include person re-identification.
Abstract:
Aiming at the large modality discrepancy between images in cross-modal person re-identification, most existing methods adopt pixel alignment or feature alignment to match images across modalities. To further improve the matching accuracy between images of the two modalities, a multi-input dual-stream network model based on a dynamic dual-attention mechanism was designed. First, images of the same person captured by different cameras were added to each training batch, so that the network could learn sufficient feature information from limited samples. Second, grayscale images obtained by homogeneous augmentation were used as an intermediate bridge: they preserve the structural information of the visible-light images while discarding color information, which weakens the network's dependence on color and strengthens its ability to mine structural information. Finally, a Weighted Six-Directional triplet Ranking (WSDR) loss applicable to images of all three modalities was proposed; it makes full use of cross-modal triplet relationships under different views, optimizes the relative distances among features of multiple modalities, and improves robustness to modality changes. Experimental results show that, on the SYSU-MM01 dataset, the proposed model improves Rank-1 and mean Average Precision (mAP) by 4.66 and 3.41 percentage points respectively compared with the Dynamic Dual-attentive AGgregation (DDAG) learning model.
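To make the abstract's two core mechanisms concrete, the sketch below illustrates, in PyTorch-style Python, what the grayscale homogeneous augmentation and a weighted six-directional triplet ranking loss over the visible/grayscale/infrared modalities could look like. This is a minimal sketch under assumptions, not the authors' released implementation: the soft weighting form (softmax over same-identity distances, softmin over cross-identity distances), the soft-margin ranking term, and all function names (`homogeneous_grayscale`, `directional_loss`, `wsdr_loss`) are illustrative, and each training batch is assumed to contain every identity in all three modalities.

```python
# Minimal sketch (not the authors' code) of the two mechanisms described above.
import torch
import torch.nn.functional as F

def homogeneous_grayscale(rgb: torch.Tensor) -> torch.Tensor:
    """Turn a batch of visible-light images (B, 3, H, W) into 3-channel
    grayscale images: color is removed, structure is kept."""
    gray = 0.299 * rgb[:, 0] + 0.587 * rgb[:, 1] + 0.114 * rgb[:, 2]
    # Replicate the single channel so the shared backbone input shape is unchanged.
    return gray.unsqueeze(1).repeat(1, 3, 1, 1)

def directional_loss(anchors, anchor_ids, gallery, gallery_ids):
    """One of the six directions: anchors from modality A ranked against
    positives/negatives from modality B, with soft weighting (assumed form)."""
    dist = torch.cdist(anchors, gallery)                      # (Na, Nb) L2 distances
    pos = anchor_ids.unsqueeze(1) == gallery_ids.unsqueeze(0)
    # Harder positives (far away) and harder negatives (close by) get larger weights.
    w_pos = torch.softmax(dist.masked_fill(~pos, float('-inf')), dim=1)
    w_neg = torch.softmax((-dist).masked_fill(pos, float('-inf')), dim=1)
    d_pos = (w_pos * dist).sum(dim=1)                         # weighted same-identity distance
    d_neg = (w_neg * dist).sum(dim=1)                         # weighted cross-identity distance
    return F.softplus(d_pos - d_neg).mean()                   # soft-margin ranking term

def wsdr_loss(feats: dict, ids: torch.Tensor) -> torch.Tensor:
    """feats maps 'vis' / 'gray' / 'ir' to (N, d) feature batches sharing the
    identity labels `ids`; 3 modalities give 6 ordered directions."""
    mods = ('vis', 'gray', 'ir')
    terms = [directional_loss(feats[a], ids, feats[b], ids)
             for a in mods for b in mods if a != b]
    return torch.stack(terms).mean()
```

In training, `feats['gray']` would be produced by passing `homogeneous_grayscale(visible_batch)` through the shared backbone, so the grayscale stream serves as the bridge between visible and infrared features; averaging the six directional terms keeps each cross-modal view equally weighted, in line with the abstract's description of optimizing the relative distances among all three modalities.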
Dawei LI, Zhiyong ZENG. Cross-modal person re-identification model based on dynamic dual-attention mechanism[J]. Journal of Computer Applications, 2022, 42(10): 3200-3208.
Tab. 1 Evaluation of each proposed component on the SYSU-MM01 dataset

| Mode | All-search r=1 | All-search r=5 | All-search r=10 | All-search r=20 | All-search mAP | Indoor-search r=1 | Indoor-search r=5 | Indoor-search r=10 | Indoor-search r=20 | Indoor-search mAP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| B | 0.547 5 | 0.823 1 | 0.903 9 | 0.958 1 | 0.530 2 | 0.610 2 | 0.871 3 | 0.940 6 | 0.984 1 | 0.679 8 |
| B+H0 | 0.568 1 | 0.825 7 | 0.912 4 | 0.964 1 | 0.534 2 | 0.629 1 | 0.881 0 | 0.935 6 | 0.979 1 | 0.689 9 |
| B+DHHI | 0.572 4 | 0.829 3 | 0.915 7 | 0.966 4 | 0.542 5 | 0.636 2 | 0.888 5 | 0.941 2 | 0.982 1 | 0.691 3 |
| B+DHHI+SDR | 0.593 7 | 0.852 3 | 0.929 8 | 0.972 4 | 0.563 1 | 0.650 7 | 0.899 5 | 0.956 7 | 0.986 5 | 0.715 5 |
| B+DHHI+WSDR | 0.594 1 | 0.854 9 | 0.934 5 | 0.975 8 | 0.564 3 | 0.652 5 | 0.901 1 | 0.959 5 | 0.989 7 | 0.718 9 |
Tab. 2 Rank-1 and mAP under different triplet loss variants

| Loss strategy | All-search r=1 | All-search mAP | Indoor-search r=1 | Indoor-search mAP |
| --- | --- | --- | --- | --- |
| Triplet (Hard)[37] | 0.539 1 | 0.517 6 | 0.585 7 | 0.658 9 |
| WTDR[2] | 0.564 2 | 0.533 2 | 0.625 4 | 0.687 2 |
| WSDR | 0.582 3 | 0.550 8 | 0.641 0 | 0.703 9 |
Tab. 3 Validity verification of the IWPA and CGSA modules

| Strategy | All-search r=1 | All-search mAP | Indoor-search r=1 | Indoor-search mAP |
| --- | --- | --- | --- | --- |
| Base | 0.573 3 | 0.542 6 | 0.634 1 | 0.685 0 |
| Base+IWPA | 0.583 9 | 0.550 4 | 0.641 2 | 0.693 4 |
| Base+CGSA | 0.573 5 | 0.548 0 | 0.635 0 | 0.690 3 |
| Base+IWPA+CGSA | 0.594 1 | 0.564 3 | 0.652 5 | 0.718 9 |
Tab. 4 Computational overhead of different models

| Model | Training time per epoch/s | Parameters/10⁶ |
| --- | --- | --- |
| DDAG | 299.821 | 362.48 |
| BADIN | 736.375 | 363.53 |
Tab. 5 Performance comparison of the proposed method with state-of-the-art methods on the SYSU-MM01 dataset

| Method | All-search r=1 | All-search r=10 | All-search r=20 | All-search mAP | Indoor-search r=1 | Indoor-search r=10 | Indoor-search r=20 | Indoor-search mAP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HOG[38] | 0.027 6 | 0.183 0 | 0.319 0 | 0.042 4 | 0.032 2 | 0.247 0 | 0.445 0 | 0.072 5 |
| LOMO[12] | 0.036 4 | 0.232 0 | 0.373 0 | 0.045 3 | 0.057 5 | 0.344 0 | 0.549 0 | 0.102 0 |
| Zero-Padding[24] | 0.148 0 | 0.541 0 | 0.713 0 | 0.159 0 | 0.206 0 | 0.684 0 | 0.858 0 | 0.269 0 |
| eBDTR[25] | 0.278 2 | 0.673 4 | 0.813 4 | 0.284 2 | 0.324 6 | 0.774 2 | 0.896 2 | 0.424 6 |
| HSME[27] | 0.206 8 | 0.627 4 | 0.779 5 | 0.231 2 | ― | ― | ― | ― |
| D2RL[20] | 0.289 0 | 0.706 0 | 0.824 0 | 0.292 0 | ― | ― | ― | ― |
| MAC[28] | 0.332 6 | 0.790 4 | 0.900 9 | 0.362 2 | 0.364 3 | 0.623 6 | 0.716 3 | 0.370 3 |
| MSR[29] | 0.373 5 | 0.834 0 | 0.933 4 | 0.381 1 | 0.396 4 | 0.892 9 | 0.976 6 | 0.508 8 |
| AlignGAN[4] | 0.424 0 | 0.850 0 | 0.937 0 | 0.407 0 | 0.459 0 | 0.876 0 | 0.944 0 | 0.543 0 |
| AGW[39] | 0.475 0 | 0.843 9 | 0.921 4 | 0.476 5 | 0.541 7 | 0.911 4 | 0.959 8 | 0.629 7 |
| DDAG[5] | 0.547 5 | 0.903 9 | 0.958 1 | 0.530 2 | 0.610 2 | 0.940 6 | 0.984 1 | 0.679 8 |
| BADIN | 0.594 1 | 0.934 5 | 0.975 8 | 0.564 3 | 0.652 5 | 0.959 5 | 0.989 7 | 0.718 9 |
Tab. 6 Performance comparison of the proposed method with state-of-the-art methods on the RegDB dataset

| Method | Visible-to-infrared r=1 | Visible-to-infrared r=10 | Visible-to-infrared r=20 | Visible-to-infrared mAP | Infrared-to-visible r=1 | Infrared-to-visible r=10 | Infrared-to-visible r=20 | Infrared-to-visible mAP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Zero-Padding[24] | 0.177 5 | 0.342 1 | 0.443 5 | 0.189 0 | 0.166 3 | 0.346 8 | 0.442 5 | 0.178 2 |
| eBDTR[25] | 0.346 2 | 0.589 6 | 0.687 2 | 0.334 6 | 0.342 1 | 0.587 4 | 0.686 4 | 0.324 9 |
| HSME[27] | 0.508 5 | 0.733 6 | 0.816 6 | 0.470 0 | 0.501 5 | 0.724 0 | 0.810 7 | 0.461 6 |
| D2RL[20] | 0.434 0 | 0.661 0 | 0.763 0 | 0.441 0 | ― | ― | ― | ― |
| MAC[28] | 0.364 3 | 0.623 6 | 0.716 3 | 0.370 3 | 0.362 0 | 0.616 8 | 0.709 9 | 0.366 3 |
| MSR[29] | 0.484 3 | 0.703 2 | 0.799 5 | 0.486 7 | ― | ― | ― | ― |
| AlignGAN[4] | 0.579 0 | ― | ― | 0.536 0 | 0.563 0 | ― | ― | 0.534 0 |
| DDAG[5] | 0.693 4 | 0.861 9 | 0.914 9 | 0.634 6 | 0.680 6 | 0.851 5 | 0.903 1 | 0.618 0 |
| BADIN | 0.705 3 | 0.875 7 | 0.927 9 | 0.667 6 | 0.692 7 | 0.864 3 | 0.912 2 | 0.653 7 |
1 SONG W R, ZHAO Q Q, CHEN C H, et al. Survey on pedestrian re-identification research[J]. CAAI Transactions on Intelligent Systems, 2017, 12(6): 770-780 (in Chinese). 10.11992/tis.201706084
2 YE M, SHEN J B, SHAO L. Visible-infrared person re-identification via homogeneous augmented tri-modal learning[J]. IEEE Transactions on Information Forensics and Security, 2021, 16: 728-739. 10.1109/tifs.2020.3001665
3 DAI P, JI R, WANG H, et al. Cross-modality person re-identification with generative adversarial training[C]// Proceedings of the 27th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2018: 677-683. 10.24963/ijcai.2018/94
4 WANG G A, ZHANG T Z, CHENG J, et al. RGB-infrared cross-modality person re-identification via joint pixel and feature alignment[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 3622-3631. 10.1109/iccv.2019.00372
5 YE M, SHEN J B, CRANDALL D J, et al. Dynamic dual-attentive aggregation learning for visible-infrared person re-identification[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12362. Cham: Springer, 2020: 229-247.
6 YI D, LEI Z, LIAO S C, et al. Deep metric learning for person re-identification[C]// Proceedings of the 22nd International Conference on Pattern Recognition. Piscataway: IEEE, 2014: 34-39. 10.1109/icpr.2014.16
7 JÜNGLING K, BODENSTEINER C, ARENS M. Person re-identification in multi-camera networks[C]// Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2011: 55-61. 10.1109/cvprw.2011.5981771
8 ZHENG L, BIE Z, SUN Y F, et al. MARS: a video benchmark for large-scale person re-identification[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9910. Cham: Springer, 2016: 868-884.
9 FELZENSZWALB P, McALLESTER D, RAMANAN D. A discriminatively trained, multiscale, deformable part model[C]// Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2008: 1-8. 10.1109/cvpr.2008.4587597
10 ZHENG W S, GONG S G, XIANG T. Reidentification by relative distance comparison[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(3): 653-668. 10.1109/tpami.2012.138
11 WEINBERGER K Q, SAUL L K. Distance metric learning for large margin nearest neighbor classification[J]. Journal of Machine Learning Research, 2009, 10: 207-244.
12 LIAO S C, HU Y, ZHU X Y, et al. Person re-identification by local maximal occurrence representation and metric learning[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 2197-2206. 10.1109/cvpr.2015.7298832
13 ZHENG W S, GONG S G, XIANG T. Person re-identification by probabilistic relative distance comparison[C]// Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2011: 649-656. 10.1109/cvpr.2011.5995598
14 PEDAGADI S, ORWELL J, VELASTIN S, et al. Local Fisher discriminant analysis for pedestrian re-identification[C]// Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2013: 3318-3325. 10.1109/cvpr.2013.426
15 WANG J Y, ZHU X T, GONG S G, et al. Transferable joint attribute-identity deep learning for unsupervised person re-identification[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 2275-2284. 10.1109/cvpr.2018.00242
16 ZHANG X, LUO H, FAN X, et al. AlignedReID: surpassing human-level performance in person re-identification[EB/OL]. (2018-01-31) [2021-10-10].
17 SUN Y F, XU Q, LI Y L, et al. Perceive where to focus: learning visibility-aware part-level features for partial person re-identification[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 393-402. 10.1109/cvpr.2019.00048
18 SUN Y F, ZHENG L, YANG Y, et al. Beyond part models: person retrieval with refined part pooling (and a strong convolutional baseline)[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11208. Cham: Springer, 2018: 501-518.
19 ZHENG F, DENG C, SUN X, et al. Pyramidal person re-identification via multi-loss dynamic training[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 8506-8514. 10.1109/cvpr.2019.00871
20 WANG Z X, WANG Z, ZHENG Y Q, et al. Learning to reduce dual-level discrepancy for infrared-visible person re-identification[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 618-626. 10.1109/cvpr.2019.00071
21 YE M, LAN X Y, LI J W, et al. Hierarchical discriminative learning for visible thermal person re-identification[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2018: 7501-7508. 10.1609/aaai.v32i1.12293
22 LI S, XIAO T, LI H S, et al. Identity-aware textual-visual matching with latent co-attention[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 1908-1917. 10.1109/iccv.2017.209
23 PANG L, WANG Y W, SONG Y Z, et al. Cross-domain adversarial feature learning for sketch re-identification[C]// Proceedings of the 26th ACM International Conference on Multimedia. New York: ACM, 2018: 609-617. 10.1145/3240508.3240606
24 WU A C, ZHENG W S, YU H X, et al. RGB-infrared cross-modality person re-identification[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 5390-5399. 10.1109/iccv.2017.575
25 YE M, LAN X Y, WANG Z, et al. Bi-directional center-constrained top-ranking for visible thermal person re-identification[J]. IEEE Transactions on Information Forensics and Security, 2020, 15: 407-419. 10.1109/tifs.2019.2921454
26 ZHU Y X, YANG Z, WANG L, et al. Hetero-center loss for cross-modality person re-identification[J]. Neurocomputing, 2020, 386: 97-109. 10.1016/j.neucom.2019.12.100
27 HAO Y, WANG N N, LI J, et al. HSME: hypersphere manifold embedding for visible thermal person re-identification[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 8385-8392. 10.1609/aaai.v33i01.33018385
28 YE M, LAN X Y, LENG Q M. Modality-aware collaborative learning for visible thermal person re-identification[C]// Proceedings of the 27th ACM International Conference on Multimedia. New York: ACM, 2019: 347-355. 10.1145/3343031.3351043
29 FENG Z X, LAI J H, XIE X H. Learning modality-specific representations for visible-infrared person re-identification[J]. IEEE Transactions on Image Processing, 2020, 29: 579-590. 10.1109/tip.2019.2928126
30 LIU C T, WU C W, WANG Y C F, et al. Spatially and temporally efficient non-local attention network for video-based person re-identification[C]// Proceedings of the 2019 British Machine Vision Conference. Durham: BMVA Press, 2019: No.77. 10.1145/3377170.3377253
31 SHAO R, LAN X Y, LI J W, et al. Multi-adversarial discriminative deep domain generalization for face presentation attack detection[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 10015-10023. 10.1109/cvpr.2019.01026
32 YIN J H, MA Z Y, XIE J Y, et al. DF2AM: dual-level feature fusion and affinity modeling for RGB-infrared cross-modality person re-identification[EB/OL]. (2021-04-01) [2021-06-10]. 10.1016/j.neucom.2022.09.077
33 HARWOOD B, VIJAY K B G, CARNEIRO G, et al. Smart mining for deep metric learning[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2840-2848. 10.1109/iccv.2017.307
34 WANG Y M, CHOI J, MORARIU V I, et al. Mining discriminative triplets of patches for fine-grained classification[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 1163-1172. 10.1109/cvpr.2016.131
35 WANG C, ZHANG Q, HUANG C, et al. Mancs: a multi-task attentional network with curriculum sampling for person re-identification[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11208. Cham: Springer, 2018: 384-400.
36 NGUYEN D T, HONG H G, KIM K W, et al. Person recognition system based on a combination of body images from visible light and thermal cameras[J]. Sensors, 2017, 17(3): No.605. 10.3390/s17030605
37 HERMANS A, BEYER L, LEIBE B. In defense of the triplet loss for person re-identification[EB/OL]. (2017-11-21) [2020-10-10]. 10.21203/rs.3.rs-1501673/v1
38 DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2005: 886-893. 10.1109/cvpr.2005.4
39 YE M, SHEN J B, LIN G J, et al. Deep learning for person re-identification: a survey and outlook[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(6): 2872-2893.