Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (2): 530-536. DOI: 10.11772/j.issn.1001-9081.2020050739

Special Topic: Multimedia Computing and Computer Simulation

• Multimedia Computing and Computer Simulation •

• Corresponding author: LIU Ziyan
  • About the authors: LIU Ziyan (1974-), female, a native of Duyun, Guizhou, M.S., associate professor, CCF member. Her research interests include wireless communication systems, mobile robots, and big data mining and analysis. ZHU Mingcheng (1993-), male, a native of Jiangyin, Jiangsu, M.S. candidate. His research interests include person re-identification. YUAN Lei (1995-), male, a native of Sinan, Guizhou, M.S. candidate. His research interests include object detection. MA Shanshan (1996-), female, a native of Zunyi, Guizhou, M.S. candidate. Her research interests include deep learning. CHEN Lingzhouting (1981-), male, a native of Qingtian, Zhejiang, Ph.D., associate professor. His research interests include intelligent control.

Video person re-identification based on non-local attention and multi-feature fusion

LIU Ziyan1, ZHU Mingcheng1, YUAN Lei1, MA Shanshan1, CHEN Lingzhouting2   

  1. College of Big Data and Information Engineering, Guizhou University, Guiyang Guizhou 550025, China;
    2. School of Aerospace Engineering, Guizhou Institute of Technology, Guiyang Guizhou 550003, China
  • Received: 2020-06-01 Revised: 2020-07-27 Online: 2021-02-10 Published: 2020-08-14
  • Supported by:
    This work is partially supported by the Natural Science Foundation of Guizhou Province ([2016]1054), the Joint Natural Science Foundation of Guizhou Province (LH[2017]7226), the 2017 Special Project of New Academic Talent Training and Innovation Exploration of Guizhou University ([2017]5788), the Guizhou Provincial Science and Technology Program ([2017]1069), the Major Research Program of Innovation Groups of Guizhou Educational Department ([2018]026), the Project of Engineering Research Center of Guizhou Colleges and Universities ([2018]007), and the Key Project of Science and Technology Plan of Guizhou Province ([2019]1416).



Abstract: Existing video person re-identification methods cannot effectively extract the spatiotemporal information between consecutive frames of a video. To address this, a person re-identification network based on non-local attention and multi-feature fusion was proposed to extract global and local representation features as well as temporal information. Firstly, non-local attention modules were embedded to extract global features. Then, multi-feature fusion was realized by extracting low-level and middle-level features as well as local features, so as to obtain the salient features of the person. Finally, similarity measurement and ranking were performed on the person features to compute the accuracy of video person re-identification. On the large datasets MARS and DukeMTMC-VideoReID, the proposed model significantly outperforms the existing Multi-scale 3D Convolution (M3D) and Learned Clip Similarity Aggregation (LCSA) models, reaching a mean Average Precision (mAP) of 81.4% and 93.4% and a Rank-1 of 88.7% and 95.3%, respectively. On the small dataset PRID2011, the proposed model also reaches a Rank-1 of 94.8%.
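The non-local attention mentioned in the abstract is a standard self-attention building block in which every feature position aggregates information from all other positions. As an illustration only (the paper's exact architecture is not given here), a minimal NumPy sketch of the embedded-Gaussian non-local operation follows; all function and parameter names are hypothetical, and the projection matrices stand in for weights that a real network would learn:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def non_local_block(x, w_theta, w_phi, w_g, w_out):
    """Embedded-Gaussian non-local operation on a flattened feature map.

    x: (N, C) array of N spatial/temporal positions with C channels.
    w_theta, w_phi, w_g: (C, C') projection matrices (learned in practice).
    w_out: (C', C) projection back to the input channel count.
    Every output position attends to every other position, then a
    residual connection adds the result back to the input.
    """
    theta = x @ w_theta                        # queries, (N, C')
    phi = x @ w_phi                            # keys,    (N, C')
    g = x @ w_g                                # values,  (N, C')
    attn = softmax(theta @ phi.T, axis=-1)     # (N, N) pairwise weights
    y = attn @ g                               # aggregate over all positions
    return x + y @ w_out                       # residual connection
```

Because the attention matrix is N-by-N over all positions, the block captures long-range dependencies that stacked local convolutions reach only slowly, which is what makes it useful for relating a person's appearance across distant frames.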

Key words: video person re-identification, spatiotemporal information, global feature, non-local attention, feature fusion
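The final stage of the pipeline, measuring similarity between sequence-level person features and ranking gallery candidates, can be sketched with cosine similarity (one common choice in re-identification; the paper's actual metric is not specified here, and all names below are hypothetical):

```python
import numpy as np

def rank_gallery(query_feat, gallery_feats):
    """Rank gallery features by cosine similarity to a query feature.

    query_feat: (C,) sequence-level feature of the query track.
    gallery_feats: (M, C) features of M gallery tracks.
    Returns gallery indices sorted from most to least similar; Rank-1
    accuracy then asks whether index 0 shares the query's identity.
    """
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q                  # cosine similarity per gallery track
    return np.argsort(-sims)      # indices in descending similarity
```

Evaluation metrics such as Rank-1 and mAP are then computed from these ranked lists over all queries, comparing the identities of the top-ranked gallery tracks against the query identity.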

CLC number: