Journal of Computer Applications, 2019, Vol. 39, Issue (1): 199-204. DOI: 10.11772/j.issn.1001-9081.2018051132

• Artificial Intelligence •

Crowd counting using multi-scale multi-task convolutional neural network

CAO Jinmeng1, NI Rongrong2, YANG Biao1

  1. School of Information Science & Engineering, Changzhou University, Changzhou Jiangsu 213164, China;
  2. Department of Energy Management, Changzhou Vocational Institute of Textile and Garment, Changzhou Jiangsu 213164, China
  • Received:2018-06-01 Revised:2018-08-03 Online:2019-01-10 Published:2019-01-21
  • Corresponding author: YANG Biao
  • About the authors: CAO Jinmeng (1994-), female, native of Jiangyin, Jiangsu, master's student, CCF member; research interests: deep learning, pattern recognition. NI Rongrong (1987-), female, native of Nantong, Jiangsu, master's degree; research interests: deep learning, pattern recognition. YANG Biao (1987-), male, native of Changzhou, Jiangsu, lecturer, Ph.D., CCF member; research interests: deep learning, pattern recognition.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61501060), the Natural Science Foundation of Jiangsu Province (BK20150271), and the Open Project of the Key Laboratory for New Technology Application of Road Conveyance of Jiangsu Province (BM20082061708).

Abstract: Crowd counting is of significant value in intelligent surveillance. To address scale variation, non-uniform density distribution, and occlusion in crowds, a crowd counting method based on a Multi-scale Multi-task Convolutional Neural Network (MMCNN) is proposed. First, a novel adaptive human-shaped kernel is used to generate density maps that describe crowd information and eliminate the influence of occlusion. Second, a multi-scale convolutional neural network is constructed to handle scale variation, and a multi-task learning mechanism that simultaneously estimates the density map and the crowd density level is used to handle non-uniform density distribution. Finally, a weighted loss function is designed to improve counting accuracy. Evaluations on the UCF_CC_50 and World Expo'10 datasets verify the effectiveness of the adaptive human-shaped kernel. The experimental results show that, compared with the method of Sindagi et al. (SINDAGI V A, PATEL V M. CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting. Proceedings of the 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance. Piscataway, NJ: IEEE, 2017: 1-6), the proposed method reduces the Mean Absolute Error (MAE) and Mean Squared Error (MSE) on the UCF_CC_50 dataset by approximately 1.7 and 45, respectively. Compared with the method of Zhang et al. (ZHANG Y, ZHOU D, CHEN S, et al. Single-image crowd counting via multi-column convolutional neural network. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2016: 589-597), the proposed method reduces the MAE on the World Expo'10 dataset by approximately 1.5. On a real-world bus dataset, the counting error is only 0 to 3 people, which indicates that the method is practical.
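
As a rough illustration of the quantities named in the abstract, the sketch below builds a density map from head annotations and computes MAE and MSE over a set of images. It is not the authors' method: the adaptive human-shaped kernel is not specified here, so a fixed isotropic Gaussian kernel is swapped in, and the function names, kernel size, and sigma are all hypothetical. The MSE is computed as the plain mean of squared count errors; some crowd counting papers report its square root under the same name, and which convention this paper follows is not stated in the abstract.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a size x size Gaussian kernel normalized to sum to 1 (size must be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def density_map(shape, heads, size=15, sigma=4.0):
    """Accumulate one unit-mass kernel per annotated head position (row, col).

    Assumption: a fixed Gaussian kernel stands in for the paper's
    adaptive human-shaped kernel, which is not described in the abstract.
    """
    h, w = shape
    pad = size // 2
    # Work on a padded canvas so kernels near the border can be added
    # without index errors; mass that falls outside the image is cropped.
    padded = np.zeros((h + 2 * pad, w + 2 * pad), dtype=np.float64)
    kernel = gaussian_kernel(size, sigma)
    for r, c in heads:
        padded[r:r + size, c:c + size] += kernel
    return padded[pad:pad + h, pad:pad + w]


def mae_mse(pred_counts, true_counts):
    """Mean absolute error and mean squared error of per-image counts."""
    err = np.asarray(pred_counts, dtype=np.float64) - np.asarray(true_counts, dtype=np.float64)
    return float(np.mean(np.abs(err))), float(np.mean(err ** 2))


# Hypothetical annotations: three heads, all far enough from the border
# that no kernel mass is cropped.
heads = [(20, 30), (40, 50), (41, 52)]
dmap = density_map((64, 96), heads)
estimated_count = dmap.sum()  # integrating the density map gives the count

mae, mse = mae_mse([10, 20], [12, 17])
```

Because each kernel sums to 1, integrating the density map recovers the number of annotated heads, which is why a network that regresses density maps can be scored with count-level metrics such as MAE and MSE.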

Key words: crowd counting, multi-scale, multi-task learning, Convolutional Neural Network (CNN), adaptive human-shaped kernel, weighted loss function

CLC Number: