Pedestrian attribute recognition based on two-domain self-attention mechanism

Journal of Computer Applications, 2021, Vol. 41, Issue (2): 372-378. DOI: 10.11772/j.issn.1001-9081.2020060850

Topic: Artificial Intelligence

WU Rui^1,2, LIU Yu^2, FENG Kai^1,2

1. Wuhan Research Institute of Posts and Telecommunications, Wuhan Hubei 430074, China;
2. Nanjing Fiberhome Starrysky Company Limited, Nanjing Jiangsu 210019, China

Received: 2020-06-19 Revised: 2020-09-18 Online: 2020-12-17 Published: 2021-02-10
Corresponding author: WU Rui
About the authors: WU Rui (born 1996), male, from Liping, Guizhou, M.S. candidate; research interests: computer vision, pedestrian attribute recognition, person re-identification. LIU Yu (born 1974), male, from Liaoyuan, Jilin, M.S., senior engineer; research interests: Internet applications, network security, big data. FENG Kai (born 1996), male, from Yan'an, Shaanxi, M.S. candidate; research interests: computer vision, deep learning.
Supported by:
This work is partially supported by the National Key Research and Development Program of China (2017YFBI400704).

Abstract

Abstract: In pedestrian attribute recognition, different attributes place different requirements on feature granularity and feature dependence. To address this, a pedestrian attribute recognition model is proposed that is built on a two-domain self-attention mechanism combining spatial self-attention and channel self-attention. First, ResNet50 is used as the backbone network to extract features that carry a certain level of semantic information. These features are then fed into a two-branch network: one branch extracts self-attention features with spatial dependence and semantic relevance, and the other extracts global features that carry holistic information. Finally, the features of the two branches are concatenated, and Batch Normalization (BN) together with a weighted loss is used to reduce the impact of imbalanced pedestrian attribute samples. Experimental results on two pedestrian attribute datasets, PETA and RAP, show that the proposed model improves mean accuracy over the baseline model by 3.91 and 4.05 percentage points, respectively, and is competitive with existing pedestrian attribute recognition models. The proposed method can produce structured descriptions of pedestrians in surveillance scenarios, improving the accuracy and efficiency of tasks such as pedestrian analysis and retrieval.

Key words: pedestrian attribute recognition, spatial self-attention, channel self-attention, feature dependence, semantic relevance
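
To make the pipeline in the abstract more concrete, the following is a minimal sketch of the two attention domains, the branch concatenation, and a weighted loss. It is not the authors' implementation. Several simplifications are assumptions made here for illustration: the feature map is a plain list of N positions by C channels (standing in for ResNet50 output), the learned query/key/value projections and the BN layer are omitted, the spatial and channel outputs are combined by element-wise sum, and the loss weights `exp(1 - r)` for positives and `exp(r)` for negatives (with `r` the positive ratio of an attribute) are one possible weighting scheme, not necessarily the one used in the paper. All function names are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def spatial_self_attention(x):
    """x is N positions x C channels; every position attends to all positions."""
    attn = [softmax(r) for r in matmul(x, transpose(x))]  # N x N weights
    return add(x, matmul(attn, x))                        # residual connection

def channel_self_attention(x):
    """Every channel attends to all channels."""
    xt = transpose(x)                                     # C x N
    attn = [softmax(r) for r in matmul(xt, x)]            # C x C weights
    return add(x, transpose(matmul(attn, xt)))            # back to N x C

def mean_pool(x):
    """Average over positions, giving one value per channel."""
    return [sum(col) / len(x) for col in zip(*x)]

def dual_domain_features(x):
    """Concatenate the pooled attention branch with the pooled global branch."""
    attended = add(spatial_self_attention(x), channel_self_attention(x))
    return mean_pool(attended) + mean_pool(x)             # length 2 * C

def weighted_bce(logits, labels, pos_ratio):
    """Sigmoid cross-entropy in which rare positive labels get larger weights."""
    loss = 0.0
    for z, y, r in zip(logits, labels, pos_ratio):
        p = 1.0 / (1.0 + math.exp(-z))
        w = math.exp(1.0 - r) if y == 1 else math.exp(r)  # assumed weighting
        loss -= w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return loss / len(logits)
```

In this sketch, calling `dual_domain_features` on an N x C feature map returns a vector of length 2C, which would then feed a per-attribute classifier whose logits go into `weighted_bce`. Under the assumed weights, a misclassified positive of a rare attribute is penalized more than a misclassified negative, which is the intent of a weighted loss for imbalanced attributes.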

CLC number: TP391.4

WU Rui, LIU Yu, FENG Kai. Pedestrian attribute recognition based on two-domain self-attention mechanism[J]. Journal of Computer Applications, 2021, 41(2): 372-378.

