Journal of Computer Applications, 2021, Vol. 41, Issue (2): 372-378. DOI: 10.11772/j.issn.1001-9081.2020060850

Special topic: Artificial intelligence

Pedestrian attribute recognition based on two-domain self-attention mechanism

WU Rui1,2, LIU Yu2, FENG Kai1,2

  1. Wuhan Research Institute of Posts and Telecommunications, Wuhan Hubei 430074, China;
    2. Nanjing Fiberhome Starrysky Company Limited, Nanjing Jiangsu 210019, China
  • Received: 2020-06-19 Revised: 2020-09-18 Online: 2021-02-10 Published: 2020-12-17
  • Corresponding author: WU Rui
  • About the authors: WU Rui (born 1996), male, from Liping, Guizhou, master's student; main research interests: computer vision, pedestrian attribute recognition, pedestrian re-identification. LIU Yu (born 1974), male, from Liaoyuan, Jilin, master's degree, senior engineer; main research interests: Internet applications, network security, big data. FENG Kai (born 1996), male, from Yan'an, Shaanxi, master's student; main research interests: computer vision, deep learning.
  • Supported by:
    This work is partially supported by the National Key Research and Development Program of China (2017YFBI400704).

Abstract: Different attributes in pedestrian attribute recognition place different requirements on feature granularity and feature dependence. To address this issue, a pedestrian attribute recognition model based on a two-domain self-attention mechanism, composed of a spatial self-attention mechanism and a channel self-attention mechanism, was proposed. Firstly, ResNet50 was used as the backbone network to extract features carrying a certain amount of semantic information. Then, these features were fed into a two-branch network to extract, respectively, self-attention features with spatial dependence and semantic relevance and global features carrying overall information. Finally, the features of the two branches were concatenated, and Batch Normalization (BN) together with a weighted loss was used to reduce the impact of imbalanced pedestrian attribute samples. Experimental results on two pedestrian attribute datasets, PETA and RAP, show that the proposed model improves the mean accuracy by 3.91 percentage points and 4.05 percentage points respectively over the baseline model, and is highly competitive among existing pedestrian attribute recognition models. The proposed method can produce structured descriptions of pedestrians in surveillance scenarios, improving the accuracy and efficiency of tasks such as pedestrian analysis and retrieval.

Key words: pedestrian attribute recognition, spatial self-attention, channel self-attention, feature dependence, semantic relevance
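
The two attention domains named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain dot-product affinities with no learned query/key/value projections, an element-wise sum of the spatial and channel outputs, and global average pooling for the global branch, none of which the abstract specifies. The function names (`spatial_self_attention`, `channel_self_attention`, `fuse`) are hypothetical, and the feature map is a small C x N list of lists (channels by flattened positions) standing in for a backbone output.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Apply a numerically stable softmax to every row."""
    out = []
    for row in m:
        top = max(row)
        exps = [math.exp(v - top) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def spatial_self_attention(x):
    """x is C x N. Each output position is a weighted sum over all positions."""
    attn = softmax_rows(matmul(transpose(x), x))  # N x N position affinities
    return matmul(x, transpose(attn))             # C x N


def channel_self_attention(x):
    """x is C x N. Each output channel is a weighted sum over all channels."""
    attn = softmax_rows(matmul(x, transpose(x)))  # C x C channel affinities
    return matmul(attn, x)                        # C x N


def global_average_pool(x):
    """Average each channel over all positions, giving a length-C vector."""
    return [sum(row) / len(row) for row in x]


def fuse(x):
    """Concatenate the pooled attention branch with the pooled global branch.

    Assumption: the two attention outputs are summed element-wise before pooling.
    """
    spatial = spatial_self_attention(x)
    channel = channel_self_attention(x)
    attended = [[s + c for s, c in zip(s_row, c_row)]
                for s_row, c_row in zip(spatial, channel)]
    return global_average_pool(attended) + global_average_pool(x)


# Toy feature map: 2 channels, 3 flattened spatial positions.
features = [[1.0, 0.0, 2.0],
            [0.5, 1.0, 0.0]]
print(fuse(features))  # a vector of length 2 * C
```

In this sketch the spatial branch relates every position to every other position, while the channel branch relates every channel to every other channel, which is one way to read the "spatial dependence" and "semantic relevance" described above. The BN and weighted-loss steps are omitted.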

CLC number: