Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (5): 1391-1397. DOI: 10.11772/j.issn.1001-9081.2021030459

• Artificial Intelligence •

Cross-domain person re-identification method based on attention mechanism with learning intra-domain variance

Daili CHEN1,2, Guoliang XU1,2

  1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  2. Electronic Information and Networking Research Institute, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2021-03-26 Revised: 2021-06-22 Accepted: 2021-06-23 Online: 2022-06-11 Published: 2022-05-10
  • Corresponding author: Guoliang XU
  • About the authors: CHEN Daili, born in 1996 in Yibin, Sichuan, M. S. candidate. Her research interests include computer vision.
    XU Guoliang, born in 1973 in Jinhua, Zhejiang, Ph. D., professor. His research interests include computer vision, and big data analysis and mining. xugl@cqupt.edu.cn

Abstract:

To address the severe performance degradation of person re-identification when a model is transferred across domains, a cross-domain person re-identification method based on an attention mechanism with learning of intra-domain variance was proposed. First, ResNet50 was used as the backbone network and adjusted to suit the person re-identification task; the Instance-Batch Normalization Network (IBN-Net) was introduced to improve the generalization ability of the model, and a region attention branch was added to extract more discriminative pedestrian features. Training on the source domain was treated as a classification task: cross-entropy loss was used for supervised learning, and triplet loss was introduced to mine the details of source-domain samples and thereby improve classification performance on the source domain. For training on the target domain, intra-domain variance was learned to adapt to the difference in data distribution between the source domain and the target domain. In the test phase, the output of the ResNet50 pool-5 layer was used as the image feature, and the Euclidean distance between the query image and each candidate image was calculated to measure their similarity. In experiments on two large-scale public datasets, Market-1501 and DukeMTMC-reID, the proposed method achieved Rank-1 accuracies of 80.1% and 67.7% and mean Average Precision (mAP) of 49.5% and 44.2%, respectively. The experimental results show that the proposed method performs well in improving the generalization ability of the model.
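
The source-domain objective described in the abstract combines a cross-entropy classification loss with a triplet loss. The following is a minimal pure-Python sketch of that combination for a single sample, not the authors' implementation. The function names, the margin of 0.3, and the balancing factor `weight` are illustrative assumptions that do not come from the abstract. The IBN-Net backbone, the region attention branch, and the target-domain learning of intra-domain variance are not modeled here.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample; `logits` holds one score per identity."""
    m = max(logits)  # subtract the max before exponentiating for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    margin=0.3 is an assumed value, not taken from the abstract.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def source_loss(logits, label, anchor, positive, negative, weight=1.0):
    """Sum of the two source-domain terms; `weight` is a hypothetical balance factor."""
    return cross_entropy(logits, label) + weight * triplet_loss(anchor, positive, negative)
```

The triplet term is zero once the positive sample is closer to the anchor than the negative sample by at least the margin, so only triplets that violate the margin contribute to the loss.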

Keywords: unsupervised domain adaptation, intra-domain variance, person re-identification, attention mechanism, discriminative feature
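
The test phase described in the abstract ranks candidate images by their Euclidean distance to the query feature, and the results are reported as Rank-1 accuracy and mAP. The sketch below illustrates that ranking and the two metrics in pure Python. It is a simplified illustration under stated assumptions: features are plain lists of floats, all function names are hypothetical, and the camera-based filtering of gallery images that re-identification benchmarks commonly apply is omitted.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices ordered from smallest to largest distance to the query."""
    dists = [euclidean(query_feat, g) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=lambda i: dists[i])


def average_precision(ranked_ids, query_id):
    """Mean of the precision values at each rank where a correct identity appears."""
    hits, precisions = 0, []
    for rank, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0


def evaluate(query_feats, query_ids, gallery_feats, gallery_ids):
    """Return (Rank-1 accuracy, mAP) over all queries."""
    rank1_hits, aps = 0, []
    for feat, query_id in zip(query_feats, query_ids):
        order = rank_gallery(feat, gallery_feats)
        ranked_ids = [gallery_ids[i] for i in order]
        rank1_hits += int(ranked_ids[0] == query_id)  # top match has the right identity
        aps.append(average_precision(ranked_ids, query_id))
    n = len(query_ids)
    return rank1_hits / n, sum(aps) / n
```

Rank-1 only checks the nearest candidate, while mAP rewards placing every image of the queried identity near the top of the ranking, which is why the two figures reported in the abstract differ so much on the same dataset.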

CLC number: