Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (7): 2065-2071. DOI: 10.11772/j.issn.1001-9081.2021050852

• Artificial Intelligence •

EfficientNet based dual-branch multi-scale integrated learning for pedestrian re-identification

Tianhao QIU, Shurong CHEN

  1. College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
  • Received: 2021-05-24 Revised: 2021-09-18 Accepted: 2021-09-24 Online: 2021-09-18 Published: 2022-07-10
  • Contact: Tianhao QIU
  • About author: CHEN Shurong, born in 1972 in Jishan, Shanxi, M. S., associate professor. Her research interests include modern communication networks and control, and image and video analysis and processing.

Abstract:

To address the low pedestrian re-identification rate in video images caused by small-target pedestrians, occlusion, and variable pedestrian poses, a dual-branch multi-scale integrated learning method based on EfficientNet was established. Firstly, the EfficientNet-B1 (EfficientNet-Baseline1) network was used as the backbone. Secondly, a weighted Bidirectional Feature Pyramid Network (BiFPN) branch was used to fuse the global features extracted at different scales, yielding global features that contain semantic information at different levels and thereby improving the identification rate of small-target pedestrians. Thirdly, a PCB (Part-based Convolutional Baseline) branch was used to extract deep local features, mining non-salient information about pedestrians and reducing the influence of occlusion and pose variability on the identification rate. Finally, in the training stage, the pedestrian features extracted by the two branches were each passed through the Softmax loss function to obtain separate sub-losses, which were summed for joint representation; in the test stage, the global features and deep local features were concatenated, and the Euclidean distance was calculated to obtain the re-identification matching results. On the Market1501 and DukeMTMC-Reid datasets, the Rank-1 accuracy of the method reaches 95.1% and 89.1% respectively, which is 3.9 and 2.3 percentage points higher than that of the original EfficientNet-B1 backbone. Experimental results show that the proposed model effectively improves the accuracy of pedestrian re-identification.
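
The training and test steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy logits and the toy feature vectors are all hypothetical, and the number of PCB parts and the feature dimensions are not stated in the abstract. The sketch assumes the sub-losses are summed without weights, which is how the abstract describes them, and uses only the Python standard library.

```python
import math

def softmax_cross_entropy(logits, label):
    """Softmax loss for one sample: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def joint_loss(branch_logits, label):
    """Training: sum the Softmax sub-losses of all branch heads (unweighted, an assumption)."""
    return sum(softmax_cross_entropy(logits, label) for logits in branch_logits)

def fuse(global_feat, local_feats):
    """Testing: concatenate the global feature with every local part feature."""
    fused = list(global_feat)
    for part in local_feats:
        fused.extend(part)
    return fused

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_gallery(query, gallery):
    """Return gallery indices sorted by ascending distance to the query."""
    return sorted(range(len(gallery)), key=lambda i: euclidean(query, gallery[i]))

# Toy usage with made-up values: one global head and two local part heads.
loss = joint_loss([[2.0, 0.5], [1.0, 1.0], [0.2, 0.1]], label=0)
query = fuse([0.0, 0.0], [[0.0], [0.0]])
gallery = [fuse([3.0, 4.0], [[0.0], [0.0]]), fuse([1.0, 0.0], [[0.0], [0.0]])]
best_match = rank_gallery(query, gallery)[0]  # index of the nearest gallery entry
```

In this sketch the first index returned by `rank_gallery` plays the role of the Rank-1 match: a query counts as correctly re-identified at Rank-1 when that nearest gallery entry shares its identity.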

Key words: pedestrian re-identification, EfficientNet, local feature extraction, multi-scale feature extraction, integrated learning

CLC Number: