Journal of Computer Applications, 2023, Vol. 43, Issue (1): 61-66. DOI: 10.11772/j.issn.1001-9081.2021111950

• Artificial Intelligence •

Dense crowd detection algorithm based on Faster R-CNN

ZOU Bin1,2, ZHANG Cong1,2

  1. Hubei Key Laboratory of Advanced Technology for Automotive Components (Wuhan University of Technology), Wuhan Hubei 430070, China
  2. Hubei Collaborative Innovation Center for Automotive Components Technology (Wuhan University of Technology), Wuhan Hubei 430070, China
  • Received: 2021-11-15 Revised: 2022-04-19 Online: 2023-01-12
  • Contact: ZHANG Cong, born in 1996 in Anyang, Henan, M. S. candidate. His research interests include computer vision and deep learning. 291923685@qq.com
  • About the authors: ZOU Bin, born in 1977 in Wuhan, Hubei, Ph. D., associate professor. His research interests include intelligent vehicle control, intelligent engineering vehicles, and intelligent vehicle testing; ZHANG Cong, born in 1996 in Anyang, Henan, M. S. candidate. His research interests include computer vision and deep learning.
  • Supported by:
    This work is partially supported by the Key Research and Development Program of Hubei Province (2020BAB135) and the Project of New Energy Vehicle Science and Key Technology Subject Innovation and Intelligence Introduction Base (B17034).

Abstract: To improve the accuracy of crowd detection in crowded scenes, a dense crowd detection algorithm based on an improved Faster Region-based Convolutional Neural Network (Faster R-CNN) was proposed. Firstly, spatial and channel attention mechanisms were added to the feature extraction stage, and a Strong-Bidirectional Feature Pyramid Network (S-BiFPN) was used to replace the multi-scale Feature Pyramid Network (FPN) of the original network, so that the network learned important features autonomously and extracted deep image features more effectively. Secondly, a Multi-Instance Prediction (MIP) algorithm was introduced to predict instances, so as to avoid missed detections of targets in crowded scenes. Finally, Non-Maximum Suppression (NMS) in the model was optimized, and an additional Intersection over Union (IoU) threshold was introduced to suppress interfering detections precisely. Experimental results on an open-source dense crowd detection dataset show that, compared with the original Faster R-CNN algorithm, the proposed algorithm increases Average Precision (AP) by 5.6% and the Jaccard index by 3.2%. The proposed algorithm has high detection precision and stability and can meet the needs of crowd detection in dense scenes.

Key words: dense crowd detection, Faster Region-based Convolutional Neural Network (Faster R-CNN), attention mechanism, multi-instance prediction, Strong-Bidirectional Feature Pyramid Network (S-BiFPN)
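
For readers unfamiliar with the IoU and NMS terms used in the abstract, the following is a minimal sketch of the standard IoU computation and the conventional greedy NMS procedure. It is not the optimized NMS proposed in the paper: the abstract does not specify how the additional IoU threshold is applied, so no attempt is made to reproduce it here. The names `Box`, `iou`, and `greedy_nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical box layout for this sketch: (x1, y1, x2, y2), with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # assumed default, not taken from the paper
) -> List[int]:
    """Conventional greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too strongly.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes (for example, two adjacent people) and one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept_strict = greedy_nms(boxes, scores, iou_threshold=0.5)
kept_loose = greedy_nms(boxes, scores, iou_threshold=0.7)
```

In this toy input the first two boxes have an IoU of 81/119 (about 0.68), so a threshold of 0.5 discards the second box while a threshold of 0.7 keeps it. This shows why a single fixed threshold is delicate in crowded scenes: heavily overlapping boxes may belong to different people, and a low threshold can suppress a true detection.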

CLC number: