Review on deep learning-based pedestrian re-identification
YANG Feng1,2, XU Yu1, YIN Mengxiao1,2, FU Jiacheng1, HUANG Bing1, LIANG Fangxuan1
1. School of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi 530004, China 2. Guangxi Key Laboratory of Multimedia Communications Network Technology (Guangxi University), Nanning, Guangxi 530004, China
Abstract: Pedestrian Re-IDentification (Re-ID) is a hot topic in computer vision that mainly addresses how to associate a specific person captured by different cameras at different physical locations. Traditional Re-ID methods were mainly based on extracting low-level features such as local descriptors, color histograms and human poses. In recent years, to address problems that traditional methods handle poorly, such as pedestrian occlusion and pose misalignment, deep learning-based pedestrian Re-ID methods built on regions, attention mechanisms, pose and Generative Adversarial Networks (GANs) have been proposed, and their experimental results are significantly better than earlier ones. This review therefore summarizes and classifies the research on deep learning in pedestrian Re-ID and, unlike previous reviews, divides the methods into four categories for discussion. Firstly, deep learning-based pedestrian Re-ID methods are summarized under four categories: region, attention, pose and GAN. Then the mean Average Precision (mAP) and Rank-1 performance of these methods on mainstream datasets is analyzed. The results show that deep learning-based methods can reduce model overfitting by strengthening the connection between local features and narrowing domain gaps. Finally, future directions of pedestrian Re-ID research are discussed.
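For readers unfamiliar with the two indicators named in the abstract, the following is a minimal sketch of how Rank-1 and mAP are commonly computed from a query-to-gallery distance matrix. The function names, the toy distances and the identity labels are illustrative assumptions, not taken from the reviewed methods, and the sketch omits the same-camera filtering that some benchmark protocols apply.

```python
from typing import List, Sequence, Tuple


def average_precision(ranked_ids: Sequence[int], query_id: int) -> float:
    """AP for one query: mean of the precision values at each rank
    where a gallery item with the query identity appears."""
    hits = 0
    precision_sum = 0.0
    for rank, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def rank1_and_map(
    distances: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
) -> Tuple[float, float]:
    """Return (Rank-1 accuracy, mAP) for a query-by-gallery distance matrix.

    distances[i][j] is the (hypothetical) feature distance between
    query i and gallery image j; smaller means more similar.
    """
    if not query_ids:
        raise ValueError("at least one query is required")
    rank1_hits = 0
    ap_values: List[float] = []
    for row, qid in zip(distances, query_ids):
        # Sort gallery indices from most to least similar.
        order = sorted(range(len(gallery_ids)), key=lambda j: row[j])
        ranked = [gallery_ids[j] for j in order]
        rank1_hits += int(ranked[0] == qid)
        ap_values.append(average_precision(ranked, qid))
    n = len(query_ids)
    return rank1_hits / n, sum(ap_values) / n


# Toy example (made-up numbers): two queries, three gallery images.
toy_distances = [[0.1, 0.9, 0.4], [0.2, 0.5, 0.3]]
toy_query_ids = [1, 2]
toy_gallery_ids = [1, 2, 1]
rank1, mean_ap = rank1_and_map(toy_distances, toy_query_ids, toy_gallery_ids)
print(f"Rank-1 = {rank1:.3f}, mAP = {mean_ap:.3f}")
```

Rank-1 only checks whether the top-ranked gallery image is correct, whereas mAP rewards placing all correct matches near the top of the ranking, which is why the two indicators are usually reported together.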