Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (11): 3210-3215. DOI: 10.11772/j.issn.1001-9081.2019051051

• The 2019 CCF Conference on Artificial Intelligence (CCFAI2019)

Fine-grained pedestrian detection algorithm based on improved Mask R-CNN

ZHU Fan, WANG Hongyuan, ZHANG Ji   

  1. College of Information Science and Engineering, Changzhou University, Changzhou Jiangsu 213164, China
  • Received: 2019-05-24 Revised: 2019-06-20 Online: 2019-11-10 Published: 2019-09-11
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61572085).

基于改进的Mask R-CNN的行人细粒度检测算法

朱繁, 王洪元, 张继   

  1. 常州大学 信息科学与工程学院, 江苏 常州 213164
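  • Corresponding author: WANG Hongyuan
  • About the authors: ZHU Fan (born 1994), female, from Huai'an, Jiangsu, M.S. candidate; main research interest: computer vision. WANG Hongyuan (born 1960), male, from Changshu, Jiangsu, professor, Ph.D., CCF member; main research interest: computer vision. ZHANG Ji (born 1981), male, from Changzhou, Jiangsu, lecturer, M.S., CCF member; main research interest: computer vision.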

Abstract: To address poor pedestrian detection performance in complex scenes, a pedestrian detection algorithm based on an improved Mask R-CNN framework was proposed, building on leading results in deep learning-based object detection. Firstly, the K-means algorithm was used to cluster the bounding boxes of the pedestrian datasets to obtain suitable aspect ratios; adding an aspect ratio of 2:5 gave 12 anchors adapted to the sizes of pedestrians in the images. Secondly, fine-grained image recognition techniques were incorporated to achieve high pedestrian localization accuracy. Thirdly, the foreground object was segmented by a Fully Convolutional Network (FCN), and pixel-wise prediction was performed to obtain local masks of the pedestrian (upper body, lower body), thereby achieving fine-grained pedestrian detection. Finally, the overall mask of the pedestrian was obtained by learning the local features of the pedestrian. To verify its effectiveness, the improved algorithm was compared on the same dataset with representative object detection methods, such as Faster Region-based Convolutional Neural Network (Faster R-CNN), YOLOv2 and Region-based Fully Convolutional Network (R-FCN). The experimental results show that the improved algorithm increases the speed and accuracy of pedestrian detection and reduces the false positive rate.
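
The anchor-design step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sample boxes, the helper names `kmeans_1d` and `make_anchors`, the three anchor scales, and the three baseline ratios (1:2, 1:1, 2:1) are all assumptions chosen so that adding the 2:5 ratio gives 3 × 4 = 12 anchors. The abstract does not say which features or distance measure were clustered; here plain K-means is run on the scalar width/height ratio.

```python
import math


def kmeans_1d(values, k, iters=50):
    """Lloyd's k-means on scalars; centers start at evenly spaced quantiles."""
    data = sorted(values)
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)


def make_anchors(scales, ratios):
    """Return (width, height) pairs with area scale**2 and width/height == ratio."""
    anchors = []
    for s in scales:
        for r in ratios:
            anchors.append((s * math.sqrt(r), s / math.sqrt(r)))
    return anchors


# Hypothetical pedestrian boxes as (width, height) in pixels, not from the paper.
boxes = [(40, 100), (42, 104), (38, 96), (80, 200), (50, 100), (52, 102)]
ratios_found = kmeans_1d([w / h for w, h in boxes], k=2)

# Assumed configuration: three scales, three baseline ratios plus the added 2:5.
SCALES = (128, 256, 512)
RATIOS = (0.5, 1.0, 2.0, 2 / 5)
anchors = make_anchors(SCALES, RATIOS)
```

With these sample boxes the smaller cluster center lands close to 0.4, which is the kind of evidence that would motivate adding a 2:5 (width:height) anchor for upright pedestrians.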

Key words: Mask R-CNN (Region with Convolutional Neural Network), pedestrian detection, K-means algorithm, fine-grained, Fully Convolutional Network (FCN)

