Journal of Computer Applications, 2024, Vol. 44, Issue 9: 2886-2892. DOI: 10.11772/j.issn.1001-9081.2023091269
Authors: Zhiqiang ZHAO, Peihong MA, Xinhong HEI
Received: 2023-09-18
Revised: 2023-12-12
Accepted: 2023-12-15
Online: 2024-03-21
Published: 2024-09-10
Contact: Zhiqiang ZHAO
About author: MA Peihong, born in 1998 in Zhengzhou, Henan, M. S. candidate. Her research interests include computer vision.
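Abstract:
To address scale variation, background interference, and partial occlusion in crowd counting in complex scenes, a Dilated Contextual Convolutional Neural Network based on Dual Attention mechanism (DA-DCCNN) was proposed on the basis of dilated convolution. Firstly, the convolutional layers of VGG16 were used as the feature extractor to obtain abstract, deep feature maps of crowd images. Secondly, a Dilated Contextual Module (DCM) was constructed with dilated convolution to connect the features obtained from different layers, and a Spatial Attention Module (SAM) and a Channel Attention Module (CAM) were introduced to capture contextual information. Finally, a loss function combining Euclidean distance and cross entropy was constructed to measure the difference between the attention map predicted by the network and the ground-truth attention map. Experimental results on three public datasets, ShanghaiTech, UCF_CC_50, and UCF-QNRF, show that DA-DCCNN captures multi-scale image features effectively while strengthening its perception of important regions and channels in the image, and that it achieves the best Mean Absolute Error (MAE) among the compared methods. The feature fusion network based on the dual attention mechanism perceives the spatial structure and local features of the image effectively, so the generated density maps predict and count crowd regions more accurately.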
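The abstract builds its context module on dilated convolution. As a minimal illustration of that operation only (not the authors' DCM, whose layer configuration is not given here), the sketch below applies a 1-D dilated kernel in plain Python. The function names `effective_kernel_size` and `dilated_conv1d` are hypothetical and chosen for this example.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1-D dilated cross-correlation, as in CNN layers."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            acc += weight * signal[start + j * dilation]
        out.append(acc)
    return out


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7]
    k = [1, 0, -1]
    print(effective_kernel_size(3, 2))  # 5
    print(dilated_conv1d(x, k, 1))      # [-2.0, -2.0, -2.0, -2.0, -2.0]
    print(dilated_conv1d(x, k, 2))      # [-4.0, -4.0, -4.0]
```

With the same three weights, a dilation of 2 covers five input samples instead of three, which is how dilated convolution enlarges the receptive field without adding parameters.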
Zhiqiang ZHAO, Peihong MA, Xinhong HEI. Crowd counting method based on dual attention mechanism[J]. Journal of Computer Applications, 2024, 44(9): 2886-2892.
Dataset | Number of samples | Annotated points | Resolution | Crowd size |
---|---|---|---|---|
ShanghaiTech | 1,198 | 330,165 | 768×1,024 | 9-578 |
UCF_CC_50 | 50 | 63,075 | 2,101×2,888 | 94-4,542 |
UCF-QNRF | 1,535 | >125×10⁴ | 2,013×2,902 | 49-12,865 |
Tab. 1 Detailed information of the three datasets
Dataset | Method | MAE | RMSE |
---|---|---|---|
ShanghaiTech | HA-CNN | 62.9 | 94.9 |
ShanghaiTech | ADCrowdNet | 55.4 | 97.9 |
ShanghaiTech | PCC Net | 73.5 | 124.0 |
ShanghaiTech | LSC-CNN | 66.4 | 117.0 |
ShanghaiTech | EPA | 60.9 | 91.6 |
ShanghaiTech | Lw-Count | 69.7 | 100.5 |
ShanghaiTech | CG-DRCN | 60.2 | 94.0 |
ShanghaiTech | DA-DCCNN | 49.6 | 87.1 |
UCF_CC_50 | ACSCP | 291.0 | 404.6 |
UCF_CC_50 | MRA-CNN | 240.8 | 352.6 |
UCF_CC_50 | RANet | 239.8 | 319.4 |
UCF_CC_50 | MBTTBF-SCFB | 233.1 | 300.9 |
UCF_CC_50 | PCC Net | 240.0 | 315.5 |
UCF_CC_50 | LSC-CNN | 225.6 | 302.7 |
UCF_CC_50 | SDS-CNN | 229.4 | 325.6 |
UCF_CC_50 | EPA | 250.1 | 342.7 |
UCF_CC_50 | Lw-Count | 239.3 | 307.6 |
UCF_CC_50 | HANet | 195.2 | 268.6 |
UCF_CC_50 | DA-DCCNN | 165.3 | 227.7 |
UCF-QNRF | RANet | 111.0 | 190.0 |
UCF-QNRF | MBTTBF-SCFB | 97.5 | 165.2 |
UCF-QNRF | PCC Net | 246.4 | 247.1 |
UCF-QNRF | LSC-CNN | 120.5 | 218.2 |
UCF-QNRF | KDMG | 99.5 | 173.0 |
UCF-QNRF | SDS-CNN | 115.2 | 175.7 |
UCF-QNRF | Lw-Count | 149.7 | 238.4 |
UCF-QNRF | HANet | 99.1 | 159.2 |
UCF-QNRF | DA-DCCNN | 93.3 | 160.2 |
Tab. 2 Performance comparison of different methods on the three datasets
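The MAE and RMSE columns compare predicted and true per-image crowd counts. The sketch below shows the standard definitions of the two metrics. The counts in the usage block are made-up illustrative values, not data from the paper.

```python
import math


def mae(true_counts, pred_counts):
    """Mean absolute error over per-image crowd counts."""
    n = len(true_counts)
    return sum(abs(t - p) for t, p in zip(true_counts, pred_counts)) / n


def rmse(true_counts, pred_counts):
    """Root mean squared error over per-image crowd counts."""
    n = len(true_counts)
    squared = sum((t - p) ** 2 for t, p in zip(true_counts, pred_counts))
    return math.sqrt(squared / n)


if __name__ == "__main__":
    # Hypothetical counts for three test images (illustration only).
    true_counts = [100, 250, 400]
    pred_counts = [110, 240, 430]
    print(f"MAE  = {mae(true_counts, pred_counts):.2f}")
    print(f"RMSE = {rmse(true_counts, pred_counts):.2f}")
```

Because RMSE squares each error before averaging, a few badly miscounted images raise it more than they raise MAE, which is why a method can lead on one metric and not the other, as in the UCF-QNRF rows.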
Combination No. | Method | MAE | RMSE |
---|---|---|---|
① | VGG | 81.2 | 119.4 |
② | VGG+DCM | 60.7 | 95.4 |
③ | VGG+CAM | 79.0 | 114.9 |
④ | VGG+SAM | 75.3 | 109.7 |
⑤ | VGG+DAM | 69.1 | 98.1 |
⑥ | VGG+DCM+CAM | 58.3 | 92.1 |
⑦ | VGG+DCM+SAM | 53.2 | 90.8 |
⑧ | DA-DCCNN | 49.6 | 87.1 |
Tab. 3 Experimental results for different module combinations
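To read the ablation more easily, the MAE values in Tab. 3 can be expressed as relative reductions against the plain VGG baseline. The helper `relative_reduction` below is a hypothetical name used only for this worked calculation. The numbers are copied from the table.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which an error metric drops relative to the baseline."""
    return (baseline - improved) / baseline


# MAE values copied from Tab. 3.
mae_by_variant = {
    "VGG": 81.2,
    "VGG+DCM": 60.7,
    "VGG+DAM": 69.1,
    "DA-DCCNN": 49.6,
}

if __name__ == "__main__":
    base = mae_by_variant["VGG"]
    for name, value in mae_by_variant.items():
        drop = 100 * relative_reduction(base, value)
        print(f"{name}: {drop:.1f}% lower MAE than VGG")
```

By this measure the DCM alone lowers MAE by about 25%, the VGG+DAM variant by about 15%, and the full DA-DCCNN by about 39%.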
λ | MAE | RMSE |
---|---|---|
0 | 54.3 | 94.2 |
10⁻⁵ | 53.2 | 92.1 |
10⁻⁴ | 49.6 | 87.1 |
10⁻³ | 50.1 | 90.1 |
Tab. 4 Influence of different λ on the performance of DA-DCCNN
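This page does not define λ. Assuming it is the weight that balances the cross-entropy term against the Euclidean term in the loss described in the abstract, a combined loss of that general shape could be sketched as follows. The function names, the flattened-list inputs, and the exact form of each term are assumptions made for illustration, not the authors' implementation.

```python
import math


def euclidean_loss(pred_density, true_density):
    """Squared Euclidean distance between flattened density maps."""
    return sum((p - t) ** 2 for p, t in zip(pred_density, true_density))


def binary_cross_entropy(pred_attention, true_attention, eps=1e-7):
    """Mean binary cross entropy between flattened attention maps."""
    total = 0.0
    for p, t in zip(pred_attention, true_attention):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(pred_attention)


def combined_loss(pred_density, true_density,
                  pred_attention, true_attention, lam=1e-4):
    """Assumed form: Euclidean term + lam * cross-entropy term."""
    density_term = euclidean_loss(pred_density, true_density)
    attention_term = binary_cross_entropy(pred_attention, true_attention)
    return density_term + lam * attention_term


if __name__ == "__main__":
    # Tiny made-up maps with three pixels each (illustration only).
    loss = combined_loss([0.1, 0.0, 0.3], [0.1, 0.1, 0.2],
                         [0.9, 0.1, 0.8], [1, 0, 1], lam=1e-4)
    print(f"combined loss = {loss:.6f}")
```

Under this assumption, λ = 0 switches the attention term off entirely, which would correspond to the first row of Tab. 4.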
σ | MAE | RMSE |
---|---|---|
0.001 | 73.4 | 112.6 |
0.010 | 56.4 | 96.2 |
0.100 | 52.8 | 90.1 |
0.200 | 49.6 | 87.1 |
0.300 | 51.3 | 90.7 |
Tab. 5 Influence of different σ on the performance of DA-DCCNN
t | MAE | RMSE |
---|---|---|
0.0001 | 54.6 | 97.3 |
0.0010 | 49.6 | 87.1 |
0.0050 | 53.8 | 89.5 |
0.0100 | 57.7 | 96.4 |
0.1000 | 58.3 | 95.6 |
Tab. 6 Influence of different t on the performance of DA-DCCNN