Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (7): 2182-2189. DOI: 10.11772/j.issn.1001-9081.2022060827

• Artificial Intelligence •

Person re-identification method based on multi-modal graph convolutional neural network

Jiaming HE1, Jucheng YANG1, Chao WU1, Xiaoning YAN2, Nenghua XU2

  1. College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin 300457, China
    2. Shenzhen Softsz Technology Company Limited, Shenzhen, Guangdong 518131, China
  • Received: 2022-06-10 Revised: 2022-09-02 Accepted: 2022-09-09 Online: 2022-10-11 Published: 2023-07-10
  • Corresponding author: Jucheng YANG
  • About the authors: HE Jiaming, born in 1995 in Qingyuan, Guangdong, M.S. candidate, CCF member. His research interests include person re-identification.
    YANG Jucheng, born in 1980 in Tianmen, Hubei, Ph.D., professor, CCF distinguished member. His research interests include image processing and pattern recognition.
    WU Chao, born in 1974 in Xianning, Hubei, Ph.D., lecturer. His research interests include image processing and pattern recognition.
    YAN Xiaoning, born in 1989 in Datong, Shanxi, M.S. His research interests include video-image artificial intelligence.
    XU Nenghua, born in 1982 in Shangrao, Jiangxi. His research interests include video-image artificial intelligence.


Abstract:

To address the problems that person textual attribute information is not fully utilized in person re-identification and that the semantic relationships among textual attributes are not mined, a person re-identification method based on a multi-modal Graph Convolutional neural Network (GCN) was proposed. First, a Deep Convolutional Neural Network (DCNN) was used to learn person textual attributes and person image features. Then, exploiting the effective relationship-mining ability of the GCN, the textual attribute features and image features were taken as the GCN input, and semantic information was propagated among the textual attribute nodes through graph convolution, so that the implicit semantic relationships among the textual attributes were learned and fused into the image features. Finally, robust person features were output by the GCN. The proposed multi-modal method achieved a mean Average Precision (mAP) of 87.6% and a Rank-1 accuracy of 95.1% on the Market-1501 dataset, and an mAP of 77.3% and a Rank-1 accuracy of 88.4% on the DukeMTMC-reID dataset, verifying its effectiveness.

Key words: person re-identification, multi-modal, Graph Convolutional neural Network (GCN), person textual attribute, implicit semantic relationship

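The fusion step described in the abstract — propagating semantic information between attribute and image nodes via graph convolution — can be illustrated with a minimal NumPy sketch of one graph convolution layer. The graph layout (one image node linked to several attribute nodes), the dimensions, and the random features below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def gcn_layer(H, A, W):
    """One graph convolution: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric degree normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

rng = np.random.default_rng(0)
# Hypothetical graph: node 0 holds the image feature,
# nodes 1-4 hold textual attribute features (e.g. gender, bag, hair, clothing).
n_nodes, in_dim, out_dim = 5, 8, 4
H = rng.standard_normal((n_nodes, in_dim))  # stacked node features
A = np.zeros((n_nodes, n_nodes))
A[0, 1:] = A[1:, 0] = 1.0                   # image node connected to every attribute node

W = rng.standard_normal((in_dim, out_dim))  # learnable layer weights
H_out = gcn_layer(H, A, W)                  # image row now mixes attribute semantics
print(H_out.shape)                          # (5, 4)
```

After one layer, row 0 of `H_out` is a weighted mix of the image feature and all attribute features, which is the sense in which the graph convolution "fuses" attribute semantics into the image representation; in the paper's setting the adjacency and weights would be learned from data rather than fixed as here.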

CLC number: