Journal of Computer Applications, 2023, Vol. 43, Issue (7): 2182-2189. DOI: 10.11772/j.issn.1001-9081.2022060827
Topic: Artificial intelligence
Jiaming HE1, Jucheng YANG1, Chao WU1, Xiaoning YAN2, Nenghua XU2
Received: 2022-06-10
Revised: 2022-09-02
Accepted: 2022-09-09
Online: 2022-10-11
Published: 2023-07-10
Contact: Jucheng YANG
About author: HE Jiaming, born in 1995 in Qingyuan, Guangdong, M. S. candidate, CCF member. His research interests include person re-identification.
Abstract:
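Pedestrian text attribute information is underused in person re-identification, and the semantic relations among text attributes are left unexplored. To address both problems, a person re-identification method based on a multi-modal Graph Convolutional neural Network (GCN) is proposed. First, a Deep Convolutional Neural Network (DCNN) is used to learn pedestrian text attributes and pedestrian image features. Then, drawing on the relation mining ability of GCN, the text attribute features and image features are taken as the input of the GCN, and graph convolution operations pass semantic information among the text attribute nodes, so that the implicit semantic relations among text attributes are learned and fused into the image features. Finally, the GCN outputs robust pedestrian features. The multi-modal method achieves a mean Average Precision (mAP) of 87.6% and a Rank-1 accuracy of 95.1% on the Market-1501 dataset, and an mAP of 77.3% and a Rank-1 accuracy of 88.4% on the DukeMTMC-reID dataset, which verifies the effectiveness of the proposed method.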
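The propagation step described in the abstract can be illustrated with the graph convolution rule of Kipf and Welling (reference 26). What follows is a minimal NumPy sketch, not the authors' implementation: the graph layout (one image-feature node plus four attribute nodes, fully connected), the feature sizes, and the random features and weights are all assumptions made for illustration.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the symmetric normalization of Kipf and Welling."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # node degrees including self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(features, adj_norm, weight):
    """One graph convolution: aggregate neighbor features, project them, apply ReLU."""
    return np.maximum(adj_norm @ features @ weight, 0.0)

rng = np.random.default_rng(0)

# Hypothetical graph: node 0 holds the image feature, nodes 1..4 hold attribute features.
num_nodes, dim_in, dim_out = 5, 8, 6
adj = np.ones((num_nodes, num_nodes)) - np.eye(num_nodes)   # assumed fully connected
adj_norm = normalize_adjacency(adj)

features = rng.standard_normal((num_nodes, dim_in))   # stand-in for DCNN outputs
weight = rng.standard_normal((dim_in, dim_out))       # stand-in for learned weights

out = gcn_layer(features, adj_norm, weight)
pedestrian_feature = out[0]   # image node after mixing in attribute information
```

After one layer, the row for the image node combines its own feature with those of the attribute nodes. How the paper builds the adjacency matrix and reads out the final pedestrian feature is not stated in the abstract, so both choices here are placeholders.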
Cite this article: Jiaming HE, Jucheng YANG, Chao WU, Xiaoning YAN, Nenghua XU. Person re-identification method based on multi-modal graph convolutional neural network[J]. Journal of Computer Applications, 2023, 43(7): 2182-2189.
Method | Market-1501 mAP | Market-1501 Rank-1 | DukeMTMC-reID mAP | DukeMTMC-reID Rank-1 |
---|---|---|---|---|
HA-CNN | 75.7 | 91.2 | 63.8 | 80.5 |
PCB | 77.3 | 92.4 | 69.2 | 83.3 |
OSNet | 84.9 | 94.8 | 73.5 | 88.6 |
APR | 66.9 | 87.0 | 55.5 | 73.9 |
ACRN | 62.6 | 83.4 | 52.0 | 72.6 |
AANet | 82.5 | 93.9 | 72.6 | 86.4 |
SGGNN | 82.8 | 92.3 | 68.2 | 81.1 |
MGAT | 76.5 | 91.5 | - | - |
HOReID | 84.9 | 94.2 | 75.6 | 86.9 |
CLFA | 85.9 | 94.5 | 76.4 | 86.4 |
DCC | 88.6 | - | - | - |
Base-CNN (this paper) | 85.6 | 94.2 | 76.2 | 86.3 |
MMGCN (this paper) | 87.6 | 95.1 | 77.3 | 88.4 |
Tab. 1 Comparison of experimental results on Market-1501 and DukeMTMC-reID datasets (%)
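For readers unfamiliar with the two metrics in Tab. 1, the sketch below shows how Rank-1 accuracy and mAP are commonly computed from ranked retrieval results. It is an illustration under stated assumptions, not the evaluation code used in the paper: the function names and the toy match lists are hypothetical, and the sketch omits the same-camera filtering that the standard benchmark protocols apply.

```python
import numpy as np

def average_precision(ranked_matches):
    """AP for one query; ranked_matches[i] is True when the i-th ranked
    gallery image shares the query identity."""
    matches = np.asarray(ranked_matches, dtype=bool)
    if not matches.any():
        return 0.0
    hits = np.cumsum(matches)                    # correct matches seen so far
    ranks = np.arange(1, len(matches) + 1)       # 1-based rank positions
    precision_at_hits = hits[matches] / ranks[matches]
    return float(precision_at_hits.mean())

def evaluate(all_ranked_matches):
    """Return (mAP, Rank-1 accuracy) averaged over all queries."""
    aps = [average_precision(m) for m in all_ranked_matches]
    rank1 = [bool(m[0]) for m in all_ranked_matches]
    return float(np.mean(aps)), float(np.mean(rank1))

# Toy example with two queries (made-up data).
queries = [
    [True, False, True, False],    # correct matches at ranks 1 and 3
    [False, True, False, False],   # correct match at rank 2
]
mean_ap, rank1 = evaluate(queries)
```

Rank-1 only checks whether the top-ranked gallery image is correct, while mAP rewards placing every correct match near the top, which is why the two columns can order the methods differently.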
1 | KHAMIS S, KUO C H, SINGH V K, et al. Joint learning for attribute-consistent person re-identification [C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8927. Cham: Springer, 2015: 134-146. |
2 | KÖSTINGER M, HIRZER M, WOHLHART P, et al. Large scale metric learning from equivalence constraints [C]// Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2012: 2288-2295. 10.1109/cvpr.2012.6247939 |
3 | LI W, WANG X G. Locally aligned feature transforms across views [C]// Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2013: 3594-3601. 10.1109/cvpr.2013.461 |
4 | ZHENG W S, GONG S G, XIANG T. Reidentification by relative distance comparison[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(3): 653-668. 10.1109/tpami.2012.138 |
5 | DALAL N, TRIGGS B. Histograms of oriented gradients for human detection [C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition — Volume 1. Piscataway: IEEE, 2005: 886-893. 10.1109/cvpr.2005.4 |
6 | LOWE D G. Object recognition from local scale-invariant features [C]// Proceedings of the 7th IEEE International Conference on Computer Vision — Volume 2. Piscataway: IEEE, 1999: 1150-1157. 10.1109/iccv.1999.790410 |
7 | WU Z H, PAN S R, CHEN F W, et al. A comprehensive survey on graph neural networks[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(1): 4-24. 10.1109/tnnls.2020.2978386 |
8 | YANG Y S, DENG M L, LI L, et al. Overview of pedestrian re-identification based on deep learning[J]. Computer Engineering and Applications, 2022, 58(9): 51-66. 10.3778/j.issn.1002-8331.2110-0300 |
9 | YANG F, XU Y, YIN M X, et al. Review on deep learning-based pedestrian re-identification[J]. Journal of Computer Applications, 2020, 40(5): 1243-1252. 10.11772/j.issn.1001-9081.2019091703 |
10 | LUO H, JIANG W, FAN X, et al. A survey on deep learning based person re-identification[J]. Acta Automatica Sinica, 2019, 45(11): 2032-2049. 10.16383/j.aas.c180154 |
11 | LI W, ZHAO R, XIAO T, et al. DeepReID: deep filter pairing neural network for person re-identification [C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 152-159. 10.1109/cvpr.2014.27 |
12 | SUN Y F, ZHENG L, YANG Y, et al. Beyond part models: person retrieval with refined part pooling (and a strong convolutional baseline) [C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11208. Cham: Springer, 2018: 501-518. |
13 | ZHOU K Y, YANG Y X, CAVALLARO A, et al. Omni-scale feature learning for person re-identification [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 3701-3711. 10.1109/iccv.2019.00380 |
14 | DENG X, LIAO K Y, ZHENG Y L, et al. Person re-identification based on deep multi-view feature distance learning[J]. Journal of Computer Applications, 2019, 39(8): 2223-2229. 10.11772/j.issn.1001-9081.2018122505 |
15 | LI W, ZHU X T, GONG S G. Harmonious attention network for person re-identification [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 2285-2294. 10.1109/cvpr.2018.00243 |
16 | LI S, BAK S, CARR P, et al. Diversity regularized spatiotemporal attention for video-based person re-identification [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 369-378. 10.1109/cvpr.2018.00046 |
17 | LIU H, FENG J S, QI M B, et al. End-to-end comparative attention networks for person re-identification[J]. IEEE Transactions on Image Processing, 2017, 26(7): 3492-3506. 10.1109/tip.2017.2700762 |
18 | LIU X H, ZHAO H Y, TIAN M Q, et al. HydraPlus-Net: attentive deep features for pedestrian analysis [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 350-359. 10.1109/iccv.2017.46 |
19 | LIU Z Y, WAN P P. Pedestrian re-identification feature extraction method based on attention mechanism[J]. Journal of Computer Applications, 2020, 40(3): 672-676. |
20 | LIN Y T, ZHENG L, ZHENG Z D, et al. Improving person re-identification by attribute and identity learning[J]. Pattern Recognition, 2019, 95: 151-161. 10.1016/j.patcog.2019.06.006 |
21 | SCHUMANN A, STIEFELHAGEN R. Person re-identification by deep learning attribute-complementary information [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2017: 1435-1443. 10.1109/cvprw.2017.186 |
22 | TAY C P, ROY S, YAP K H. AANet: attribute attention network for person re-Identifications [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 7127-7136. 10.1109/cvpr.2019.00730 |
23 | SHEN Y T, LI H S, YI S, et al. Person re-identification with deep similarity-guided graph neural network [C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11219. Cham: Springer, 2018: 508-526. |
24 | BAO L Q, MA B P, CHANG H, et al. Masked graph attention network for person re-identification [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2019: 1496-1505. 10.1109/cvprw.2019.00191 |
25 | WANG G A, YANG S, LIU H Y, et al. High-order information matters: learning relation and topology for occluded person re-identification [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 6448-6457. 10.1109/cvpr42600.2020.00648 |
26 | KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[EB/OL]. (2017-02-22) [2022-03-22]. 10.48550/arXiv.1609.02907 |
27 | CHEN Q Y, ZHANG W, FAN J P. Cluster-level feature alignment for person re-identification[EB/OL]. (2020-08-15) [2022-07-12]. |
28 | YAO H T, XU C S. Dual cluster contrastive learning for object re-identification[EB/OL]. (2022-04-21) [2022-07-12]. |
29 | ZHENG L, SHEN L Y, TIAN L, et al. Scalable person re-identification: a benchmark [C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1116-1124. 10.1109/iccv.2015.133 |
30 | RISTANI E, SOLERA F, ZOU R, et al. Performance measures and a data set for multi-target, multi-camera tracking [C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9914. Cham: Springer, 2016: 17-35. |
31 | ROSENBLATT F. The perceptron: a probabilistic model for information storage and organization in the brain[J]. Psychological Review, 1958, 65(6): 386-408. 10.1037/h0042519 |
32 | JIA J, HUANG H J, YANG W J, et al. Rethinking of pedestrian attribute recognition: realistic datasets and a strong baseline[EB/OL]. (2020-05-26) [2022-03-24]. |
33 | HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. 10.1109/cvpr.2016.90 |
34 | LI D W, CHEN X T, ZHANG Z, et al. Pose guided deep model for pedestrian attribute recognition in surveillance scenarios [C]// Proceedings of the 2018 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2018: 1-6. 10.1109/icme.2018.8486604 |
35 | MIKOLOV T, SUTSKEVER I, CHEN K, et al. Distributed representations of words and phrases and their compositionality [C]// Proceedings of the 26th International Conference on Neural Information Processing Systems — Volume 2. Red Hook, NY: Curran Associates Inc., 2013: 3111-3119. |
36 | PENNINGTON J, SOCHER R, MANNING C D. GloVe: global vectors for word representation [C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2014: 1532-1543. 10.3115/v1/d14-1162 |
37 | ZHANG G H, LIANG G Y, SU F, et al. Cross-domain attribute representation based on convolutional neural network [C]// Proceedings of the 2018 International Conference on Intelligent Computing, LNCS 10956. Cham: Springer, 2018: 134-142. |
38 | XU S M, LUO L K, HU S Q. Attention-based model with attribute classification for cross-domain person re-identification [C]// Proceedings of the 25th International Conference on Pattern Recognition. Piscataway: IEEE, 2021: 9149-9155. 10.1109/icpr48806.2021.9413309 |
39 | XU B L, LIU J X, HOU X X, et al. Cross domain person re-identification with large scale attribute annotated datasets[J]. IEEE Access, 2019, 7: 21623-21634. 10.1109/access.2019.2896663 |
40 | XIAO Q Q, CAO K L, CHEN H N, et al. Cross domain knowledge transfer for person re-identification[EB/OL]. (2016-11-18) [2022-03-25]. |
41 | LUO H, GU Y Z, LIAO X Y, et al. Bag of tricks and a strong baseline for deep person re-identification [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2019: 1487-1495. 10.1109/cvprw.2019.00190 |
42 | WEN Y D, ZHANG K P, LI Z F, et al. A discriminative feature learning approach for deep face recognition [C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9911. Cham: Springer, 2016: 499-515. |
43 | LI X Y, JIANG S Q. Know more say less: image captioning based on scene graphs[J]. IEEE Transactions on Multimedia, 2019, 21(8): 2117-2130. 10.1109/tmm.2019.2896516 |