Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (7): 2182-2189. DOI: 10.11772/j.issn.1001-9081.2022060827

• Artificial Intelligence •

Person re-identification method based on multi-modal graph convolutional neural network

Jiaming HE1, Jucheng YANG1, Chao WU1, Xiaoning YAN2, Nenghua XU2

  1. College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin 300457, China
    2. Shenzhen Softsz Technology Company Limited, Shenzhen, Guangdong 518131, China
  • Received: 2022-06-10 Revised: 2022-09-02 Accepted: 2022-09-09 Online: 2022-10-11 Published: 2023-07-10
  • Corresponding author: Jucheng YANG
  • About the authors: HE Jiaming, born in 1995 in Qingyuan, Guangdong, M.S. candidate, CCF member. His research interests include person re-identification.
    YANG Jucheng, born in 1980 in Tianmen, Hubei, Ph.D., professor, CCF distinguished member. His research interests include image processing and pattern recognition.
    WU Chao, born in 1974 in Xianning, Hubei, Ph.D., lecturer. His research interests include image processing and pattern recognition.
    YAN Xiaoning, born in 1989 in Datong, Shanxi, M.S. His research interests include video-image artificial intelligence.
    XU Nenghua, born in 1982 in Shangrao, Jiangxi. His research interests include video-image artificial intelligence.


Abstract:

To address the problems that person textual attribute information is not fully utilized in person re-identification and that the semantic relationships among textual attributes are not mined, a person re-identification method based on a multi-modal Graph Convolutional neural Network (GCN) was proposed. First, a Deep Convolutional Neural Network (DCNN) was used to learn person textual attributes and person image features. Then, exploiting the effective relationship-mining ability of the GCN, the textual attribute features and image features were taken as the GCN input, and semantic information was propagated among the textual attribute nodes through graph convolution, so that the implicit semantic relationships among the textual attributes were learned and fused into the image features. Finally, robust person features were output by the GCN. The proposed multi-modal method achieved a mean Average Precision (mAP) of 87.6% and a Rank-1 accuracy of 95.1% on the Market-1501 dataset, and an mAP of 77.3% and a Rank-1 accuracy of 88.4% on the DukeMTMC-reID dataset, verifying its effectiveness.

Key words: person re-identification, multi-modal, Graph Convolutional neural Network (GCN), person textual attribute, implicit semantic relationship

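The fusion step described in the abstract — propagating semantic information between attribute and image nodes via graph convolution — can be illustrated with a minimal NumPy sketch of one graph convolution layer. The graph layout (one image node linked to several attribute nodes), the dimensions, and the random features below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def gcn_layer(H, A, W):
    """One graph convolution: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric degree normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

rng = np.random.default_rng(0)
# Hypothetical graph: node 0 holds the image feature,
# nodes 1-4 hold textual attribute features (e.g. gender, bag, hair, clothing).
n_nodes, in_dim, out_dim = 5, 8, 4
H = rng.standard_normal((n_nodes, in_dim))  # stacked node features
A = np.zeros((n_nodes, n_nodes))
A[0, 1:] = A[1:, 0] = 1.0                   # image node connected to every attribute node

W = rng.standard_normal((in_dim, out_dim))  # learnable layer weights
H_out = gcn_layer(H, A, W)                  # image row now mixes attribute semantics
print(H_out.shape)                          # (5, 4)
```

After one layer, row 0 of `H_out` is a weighted mix of the image feature and all attribute features, which is the sense in which the graph convolution "fuses" attribute semantics into the image representation; in the paper's setting the adjacency and weights would be learned from data rather than fixed as here.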

CLC number: