Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (7): 2243-2249.DOI: 10.11772/j.issn.1001-9081.2023060782
• Multimedia computing and computer simulation •
Wei LI1, Xiaorong ZHANG1, Peng CHEN1, Qing LI2, Changqing ZHANG2
Received: 2023-06-27
Revised: 2023-08-24
Accepted: 2023-08-25
Online: 2023-09-04
Published: 2024-07-10
Contact: Wei LI
About author: ZHANG Xiaorong, born in 1989 in Taiyuan, Shanxi, M. S., engineer. Her research interests include argumentation in the field of armed police command and control.
Wei LI, Xiaorong ZHANG, Peng CHEN, Qing LI, Changqing ZHANG. Crowd counting algorithm with multi-scale fusion based on normal inverse Gamma distribution[J]. Journal of Computer Applications, 2024, 44(7): 2243-2249.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023060782
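Model | Module | Structure |
---|---|---|
Base Model+MSF | Multi-scale information extraction module (dilation rates 1 and 2) | Conv(512,256,3,ReLU), Conv(256,128,3,ReLU) |
Base Model+MSF | Crowd density estimation module | Conv(128,1,1,None) |
Base Model+MSF | Uncertainty estimation module | Conv(128,3,1,None) |
CSRNet+MSF | Multi-scale information extraction module (dilation rates 1 and 2) | CSRNet back-end (6 convolutional layers) |
CSRNet+MSF | Crowd density estimation module | Conv(64,1,1,None) |
CSRNet+MSF | Uncertainty estimation module | Conv(64,3,1,None) |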
Tab. 1 Network parameters of models
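The layer sizes in Tab. 1 can be turned into a quick parameter count. The sketch below is illustrative only: it assumes that `Conv(a,b,k,act)` means input channels, output channels, square kernel size, and activation, and that every convolution carries a bias term; neither convention is stated on this page. `ConvSpec`, `param_count`, and `effective_kernel` are hypothetical helper names, not part of the authors' code.

```python
from typing import NamedTuple


class ConvSpec(NamedTuple):
    """One Conv(in, out, kernel, activation) entry as read from Tab. 1."""
    in_channels: int
    out_channels: int
    kernel: int


def param_count(spec: ConvSpec, bias: bool = True) -> int:
    """Weights (in * out * k * k) plus one bias per output channel (assumed)."""
    weights = spec.in_channels * spec.out_channels * spec.kernel ** 2
    return weights + (spec.out_channels if bias else 0)


def effective_kernel(kernel: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


# Base Model+MSF rows of Tab. 1, for a single scale branch.
BASE_MSF = {
    "extract_1": ConvSpec(512, 256, 3),
    "extract_2": ConvSpec(256, 128, 3),
    "density_head": ConvSpec(128, 1, 1),
    "uncertainty_head": ConvSpec(128, 3, 1),
}

if __name__ == "__main__":
    for name, spec in BASE_MSF.items():
        print(f"{name}: {param_count(spec):,} parameters")
    for dilation in (1, 2):
        span = effective_kernel(3, dilation)
        print(f"dilation {dilation}: a 3x3 kernel spans {span} pixels")
```

Under these assumptions the two 1x1 heads are negligible next to the extraction layers, and the two dilation rates give 3-pixel and 5-pixel spatial extents for the same 3x3 weights.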
Algorithm | ShanghaiTech part A MAE | ShanghaiTech part A MSE | ShanghaiTech part B MAE | ShanghaiTech part B MSE | UCF-QNRF MAE | UCF-QNRF MSE | UCF_CC_50 MAE | UCF_CC_50 MSE |
---|---|---|---|---|---|---|---|---|
Crowd-CNN | 181.8 | 277.7 | 32.0 | 49.8 | - | - | 467 | 498.5 |
MCNN | 110.2 | 173.2 | 26.4 | 41.3 | 277.0 | 426.0 | 377.6 | 509.1 |
CMTL | 101.3 | 152.4 | 20.0 | 31.1 | 252.0 | 514.0 | 322.8 | 341.4 |
Switch-CNN | 90.4 | 135.0 | 21.6 | 33.4 | 228.0 | 445.0 | 318.1 | 439.2 |
LMSFFNet | 85.9 | 139.9 | 9.2 | 15.1 | 112.8 | 201.6 | 105.7 | 120.3 |
Base model | 71.4 | 115.7 | 10.3 | 16.5 | 119.3 | 207.7 | 290.0 | 406.4 |
Base model+MSF | 67.1 | 110.3 | 8.3 | 12.7 | 108.5 | 185.5 | 274.3 | 363.9 |
CSRNet | 68.2 | 115.0 | 10.6 | 16.0 | 110.6 | 190.1 | 266.1 | 397.5 |
CSRNet+MSF | 67.2 | 110.0 | 8.7 | 13.2 | 105.7 | 187.5 | 266.6 | 346.9 |
Tab. 2 Experiment results of different algorithms on four datasets
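For readers unfamiliar with the two columns, the following sketch shows how MAE and MSE are commonly computed over per-image counts. It assumes the widespread crowd-counting convention in which "MSE" is the square root of the mean squared count error; this page does not state the definition used, and the counts in the example are made up for illustration.

```python
import math
from typing import Sequence


def mae(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Mean absolute error between predicted and ground-truth counts."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and equally long")
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)


def mse(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Root of the mean squared count error (assumed convention for 'MSE')."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and equally long")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))


if __name__ == "__main__":
    # Hypothetical per-image counts, not taken from any dataset.
    predicted = [105.0, 48.0, 310.0]
    actual = [100.0, 50.0, 300.0]
    print(f"MAE = {mae(predicted, actual):.2f}")
    print(f"MSE = {mse(predicted, actual):.2f}")
```

Because squaring weights large errors more heavily, the MSE column is more sensitive to a few badly miscounted images than the MAE column.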
Method | Part A MAE | Part A MSE | Part B MAE | Part B MSE |
---|---|---|---|---|
Scale 1 | 68.3 | 114.4 | 9.7 | 15.1 |
Scale 2 | 67.8 | 112.3 | 9.5 | 15.7 |
Average fusion | 67.9 | 112.5 | 9.2 | 14.1 |
MSF | 67.2 | 110.0 | 8.7 | 13.2 |
Tab. 3 Effectiveness analysis of multi-scale fusion
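To make the contrast between average fusion and MSF concrete, here is a minimal per-pixel sketch. It assumes that MSF combines the two scales with the normal-inverse-Gamma (NIG) summation rule of reference 16, and that the three uncertainty channels in Tab. 1 correspond to the NIG parameters nu, alpha, and beta; this page confirms neither, so treat the code as one plausible reading rather than the authors' implementation. The class `NIG` and the functions `fuse` and `average` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NIG:
    """NIG parameters for one pixel of one scale branch (assumed layout)."""
    mu: float     # predicted density value
    nu: float     # virtual observations supporting mu, > 0
    alpha: float  # > 1
    beta: float   # > 0

    def aleatoric(self) -> float:
        """Expected data noise, E[sigma^2] = beta / (alpha - 1)."""
        return self.beta / (self.alpha - 1.0)

    def epistemic(self) -> float:
        """Model uncertainty, Var[mu] = beta / (nu * (alpha - 1))."""
        return self.beta / (self.nu * (self.alpha - 1.0))


def fuse(a: NIG, b: NIG) -> NIG:
    """NIG summation: the mean is weighted by each branch's nu."""
    nu = a.nu + b.nu
    mu = (a.nu * a.mu + b.nu * b.mu) / nu
    alpha = a.alpha + b.alpha + 0.5
    # Disagreement between a branch and the fused mean raises beta.
    beta = (a.beta + b.beta
            + 0.5 * a.nu * (a.mu - mu) ** 2
            + 0.5 * b.nu * (b.mu - mu) ** 2)
    return NIG(mu, nu, alpha, beta)


def average(a: NIG, b: NIG) -> float:
    """Plain average fusion ignores the uncertainty parameters."""
    return 0.5 * (a.mu + b.mu)


if __name__ == "__main__":
    # Hypothetical values: scale 2 is more confident (larger nu) than scale 1.
    scale_1 = NIG(mu=2.0, nu=1.0, alpha=2.0, beta=1.0)
    scale_2 = NIG(mu=4.0, nu=3.0, alpha=2.0, beta=1.0)
    fused = fuse(scale_1, scale_2)
    print("average fusion:", average(scale_1, scale_2))
    print("NIG fusion:", fused.mu, "epistemic:", fused.epistemic())
```

In this reading, the fused estimate leans toward whichever scale reports more evidence at a pixel, whereas average fusion treats both scales as equally reliable everywhere.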
1 | JI L N, CHEN Q K, CHEN Y J, et al. Real-time crowd counting method from video stream based on GPU[J]. Journal of Computer Applications, 2017, 37(1): 145-152. |
2 | YAO H, CAVALLARO A, BOUWMANS T, et al. Guest editorial introduction to the special issue on group and crowd behavior analysis for intelligent multicamera video surveillance[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2017, 27(3): 405-408. |
3 | FU Q H, LI Q K, FU J N, et al. Dense crowd counting model based on spatial dimensional recurrent perception network[J]. Journal of Computer Applications, 2021, 41(2): 544-549. |
4 | SHI Z L, YE Y D, WU Y P, et al. Crowd counting using rank-based spatial pyramid pooling network[J]. Acta Automatica Sinica, 2016, 42(6): 866-874. |
5 | ZENG C, MA H. Robust head-shoulder detection by PCA-based multilevel HOG-LBP detector for people counting [C]// Proceedings of the 2010 20th International Conference on Pattern Recognition. Piscataway: IEEE, 2010: 2069-2072. |
6 | CHAN A B, VASCONCELOS N. Bayesian Poisson regression for crowd counting [C]// Proceedings of the 2009 IEEE 12th International Conference on Computer Vision. Piscataway: IEEE, 2009: 545-551. |
7 | CHEN J, WANG Z. Crowd counting with segmentation attention convolutional neural network[J]. IET Image Processing, 2021, 15(6): 1221-1231. |
8 | CHEN M Y, WANG B S, CAO G, et al. Crowd counting method based on pixel-level attention mechanism[J]. Journal of Computer Applications, 2020, 40(1): 56-61. |
9 | ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 2881-2890. |
10 | HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. |
11 | LIN T-Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 936-944. |
12 | RONNEBERGER O, FISCHER P, BROX T. U-Net: convolutional networks for biomedical image segmentation [C]// Proceedings of the 18th International Conference on Medical Image Computing and Computer-assisted Intervention. Cham: Springer, 2015: 234-241. |
13 | ZHANG Y, ZHOU D, CHEN S, et al. Single-image crowd counting via multi-column convolutional neural network [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 589-597. |
14 | LI Y, ZHANG X, CHEN D. CSRNet: dilated convolutional neural networks for understanding the highly congested scenes [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 1091-1100. |
15 | AMINI A, SCHWARTING W, SOLEIMANY A, et al. Deep evidential regression [C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2020: 14927-14937. |
16 | MA H, HAN Z, ZHANG C, et al. Trustworthy multimodal regression with mixture of normal-inverse Gamma distributions [C]// Proceedings of the 35th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2021: 6881-6893. |
17 | CHAN A B, LIANG Z-S J, VASCONCELOS N. Privacy preserving crowd monitoring: counting people without people models or tracking [C]// Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2008: 1-7. |
18 | OÑORO-RUBIO D, LÓPEZ-SASTRE R J. Towards perspective-free object counting with deep learning [C]// Proceedings of the 14th European Conference on Computer Vision. Berlin: Springer, 2016: 615-629. |
19 | SAM D B, SURYA S, BABU R V. Switching convolutional neural network for crowd counting [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 4031-4039. |
20 | HAFNER D, TRAN D, LILLICRAP T, et al. Noise contrastive priors for functional uncertainty[C/OL]// Proceedings of the 2019 Conference on Uncertainty in Artificial Intelligence. [2023-05-30]. |
21 | MOLCHANOV D, ASHUKHA A, VETROV D. Variational dropout sparsifies deep neural networks [C]// Proceedings of the 34th International Conference on Machine Learning. New York: JMLR.org, 2017: 2498-2507. |
22 | GAL Y, GHAHRAMANI Z. Dropout as a Bayesian approximation: representing model uncertainty in deep learning [C]// Proceedings of the 33rd International Conference on Machine Learning. New York: JMLR.org, 2016: 1050-1059. |
23 | MUKHOTI J, GAL Y. Evaluating Bayesian deep learning methods for semantic segmentation [EB/OL]. (2019-03-23) [2023-07-20]. |
24 | LAKSHMINARAYANAN B, PRITZEL A, BLUNDELL C. Simple and scalable predictive uncertainty estimation using deep ensembles [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2017: 6405-6416. |
25 | ANTORÁN J, ALLINGHAM J U, HERNÁNDEZ-LOBATO J M. Depth uncertainty in neural networks [C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2020: 10620-10634. |
26 | YU F, KOLTUN V. Multi-scale context aggregation by dilated convolutions[C/OL]// Proceedings of the 2016 International Conference on Learning Representations. [2023-05-30]. |
27 | KENDALL A, GAL Y. What uncertainties do we need in Bayesian deep learning for computer vision? [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2017: 5580-5590. |
28 | IDREES H, SALEEMI I, SEIBERT C, et al. Multi-source multi-scale counting in extremely dense crowd images [C]// Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2013: 2547-2554. |
29 | IDREES H, TAYYAB M, ATHREY K, et al. Composition loss for counting, density map estimation and localization in dense crowds [C]// Proceedings of the 15th European Conference on Computer Vision. Cham: Springer, 2018: 544-559. |
30 | ZHANG C, LI H, WANG X, et al. Cross-scene crowd counting via deep convolutional neural networks [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 833-841. |
31 | SINDAGI V A, PATEL V M. CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting [C]// Proceedings of the 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance. Piscataway: IEEE, 2017: 1-6. |
32 | YI J, SHEN Z, CHEN F, et al. A lightweight multiscale feature fusion network for remote sensing object counting[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 5902113. |
33 | WANG Q, GAO J, LIN W, et al. Learning from synthetic data for crowd counting in the wild [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 8198-8207. |