Pedestrian re-identification method based on multi-scale feature fusion

doi:10.11772/j.issn.1001-9081.2020121908

Abstract

Abstract: Existing pedestrian re-identification methods rarely account for scale variation in pedestrian features during feature extraction, which makes them sensitive to environmental changes and lowers re-identification accuracy. To address this problem, a pedestrian re-identification method based on multi-scale feature fusion was proposed. Firstly, in the shallow layers of the network, multi-scale pedestrian features were extracted through a mixed pooling operation, which improved the feature extraction capability of the network. Then, a strip pooling operation was added to the residual block to extract long-range (remote) context information in the horizontal and vertical directions respectively, which reduced interference from irrelevant regions. Finally, after the residual network, dilated convolutions with different dilation rates were used to further preserve the multi-scale features, helping the model to parse the scene structure flexibly and effectively. Experimental results show that on the Market-1501 dataset the proposed method achieves a Rank-1 accuracy of 95.9% and a mean Average Precision (mAP) of 88.5%, and that on the DukeMTMC-reID dataset it achieves a Rank-1 accuracy of 90.1% and an mAP of 80.3%. These results indicate that the proposed method retains pedestrian feature information better, thereby improving the accuracy of pedestrian re-identification.
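
To make two of the operations named in the abstract concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a feature map of shape (C, H, W), and the function names `strip_pool` and `dilated_kernel_extent` are hypothetical. The strip pooling here omits the learned 1D and 1×1 convolutions that a full module would apply before the sigmoid gate, so it only illustrates how row-wise and column-wise averaging gather context along the horizontal and vertical directions. The second helper uses the standard relation between kernel size and dilation rate to show why stacking different rates covers different scales.

```python
import numpy as np


def strip_pool(x):
    """Simplified strip pooling on a feature map x of shape (C, H, W).

    Assumption: the learned convolutions of a real strip pooling module
    are left out; only the pooling, fusion, and gating steps are shown.
    """
    # Horizontal strips: average each row across the width -> (C, H, 1)
    row_ctx = x.mean(axis=2, keepdims=True)
    # Vertical strips: average each column across the height -> (C, 1, W)
    col_ctx = x.mean(axis=1, keepdims=True)
    # Broadcasting expands both strips back to (C, H, W) and fuses them
    fused = row_ctx + col_ctx
    # Sigmoid gate in (0, 1) re-weights every position of the input
    gate = 1.0 / (1.0 + np.exp(-fused))
    return x * gate


def dilated_kernel_extent(kernel_size, dilation):
    """Spatial extent covered by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


if __name__ == "__main__":
    features = np.random.rand(8, 16, 8)  # hypothetical (C, H, W) feature map
    print(strip_pool(features).shape)
    # Extent of a 3x3 kernel at several illustrative dilation rates
    print([dilated_kernel_extent(3, d) for d in (1, 2, 4)])
```

Because each output position is gated by the mean of its entire row and its entire column, a single operation relates positions that are far apart along one axis, which is the sense in which the context is long-range. The dilation rates in the usage lines are illustrative only and are not taken from the paper.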

Key words: pedestrian re-identification, multi-scale feature, remote context information, dilated convolution, feature fusion

CLC Number: TP391.4

HAN Jiandong, LI Xiaoyu. Pedestrian re-identification method based on multi-scale feature fusion[J]. Journal of Computer Applications, 2021, 41(10): 2991-2996.

韩建栋, 李晓宇. 基于多尺度特征融合的行人重识别方法[J]. 计算机应用, 2021, 41(10): 2991-2996.
