Journal of Computer Applications, 2023, Vol. 43, Issue (3): 728-735. DOI: 10.11772/j.issn.1001-9081.2022010034
Special Issue: Artificial Intelligence
• Artificial intelligence •
Received: 2022-01-13
Revised: 2022-03-10
Accepted: 2022-03-14
Online: 2022-05-31
Published: 2023-03-10
Contact: Xiaoyan JIANG
About author: YAO Yingmao, born in 1997 in Mengzhou, Henan, M. S. candidate. His research interests include person re-identification.
Supported by:
CLC Number:
Yingmao YAO, Xiaoyan JIANG. Video-based person re-identification method based on graph convolution network and self-attention graph pooling[J]. Journal of Computer Applications, 2023, 43(3): 728-735.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022010034
Method | MARS mAP | MARS R1 | MARS R5 | MARS R20 | DukeMTMC-VideoReID mAP | DukeMTMC-VideoReID R1 | DukeMTMC-VideoReID R5 | DukeMTMC-VideoReID R20 |
---|---|---|---|---|---|---|---|---|
CNN+CQDA | 47.6 | 65.3 | 82.0 | 89.0 | — | — | — | — |
TAM+SRM | 50.7 | 70.6 | 90.0 | 97.6 | — | — | — | — |
SSA+CASE | 76.1 | 86.3 | 94.7 | 98.2 | — | — | — | — |
3DCNN+NLA | 79.5 | 88.6 | 96.4 | 98.8 | 93.7 | 95.5 | 99.7 | |
COSAM | 79.9 | 84.9 | 95.5 | 97.9 | 94.1 | 95.4 | ||
STA | 80.8 | 86.3 | 95.7 | 98.1 | 94.6 | 99.6 | ||
MA | 80.9 | 87.3 | — | — | 94.8 | 96.7 | — | — |
STE-NVAN | 81.2 | 88.9 | — | — | 93.5 | 95.2 | — | — |
VKD | 83.1 | 96.8 | — | 93.5 | 95.2 | 98.6 | — | |
AITL | 88.2 | 96.5 | 95.4 | 99.6 | 99.9 | |||
Proposed method | 85.7 | 90.2 | 98.1 | 95.8 | 96.7 | 99.9 |
Tab. 1 Comparison of different methods
Model | mAP | R1 | R5 | R20 |
---|---|---|---|---|
Baseline | 84.2 | 88.7 | 96.0 | 97.7 |
Baseline+GCN | 85.3 | 88.7 | 96.6 | 98.3 |
Baseline+GCN+SAGP | 85.4 | 89.2 | 96.4 | 98.2 |
Baseline+CL+OCL | 85.1 | 88.9 | 96.7 | 98.3 |
Baseline+GCN+SAGP+CL+OCL | 85.7 | 90.2 | 96.7 | 98.1 |
Tab. 2 Ablation experimental results on MARS dataset
Number of segments | mAP | R1 | R5 | R20 |
---|---|---|---|---|
2 | 84.8 | 88.8 | 96.1 | 98.1 |
4 | 85.7 | 90.2 | 96.7 | 98.1 |
8 | 85.2 | 89.0 | 96.1 | 98.4 |
Tab. 3 Comparison of feature segmentation strategies
Pooling ratio r | mAP | R1 | R5 | R20 |
---|---|---|---|---|
10 | 85.3 | 89.5 | 96.2 | 98.1 |
20 | 85.1 | 89.2 | 96.2 | 98.1 |
25 | 85.7 | 90.2 | 96.7 | 98.1 |
30 | 85.3 | 89.6 | 96.5 | 98.1 |
50 | 85.2 | 89.1 | 96.2 | 98.0 |
75 | 85.2 | 89.3 | 96.3 | 98.2 |
90 | 85.3 | 89.0 | 96.6 | 98.3 |
Tab. 4 Comparative experimental results of graph pooling ratio
Loss weight λ | mAP | R1 | R5 | R20 |
---|---|---|---|---|
10 | 85.1 | 89.2 | 96.3 | 98.2 |
30 | 85.5 | 88.8 | 96.2 | 98.1 |
40 | 85.3 | 89.0 | 96.5 | 98.1 |
50 | 85.7 | 90.2 | 96.7 | 98.1 |
60 | 84.7 | 89.1 | 96.1 | 98.2 |
70 | 85.2 | 89.4 | 96.1 | 97.9 |
90 | 84.4 | 89.1 | 96.4 | 98.1 |
Tab. 5 Comparison on weighting parameters of loss function