
Crowd counting method based on dual attention mechanism

ZHAO Zhiqiang1,2, MA Peihong1, HEI Xinhong1,2

  1. School of Computer Science and Engineering, Xi'an University of Technology; 2. Shaanxi Key Laboratory of Network Computing and Security Technology (Xi'an University of Technology)
  • Received: 2023-09-15  Revised: 2023-12-12  Online: 2024-03-21  Published: 2024-03-21
  • Corresponding author: ZHAO Zhiqiang
  • About the authors: ZHAO Zhiqiang, born in 1985 in Longde, Ningxia, is a lecturer, Ph.D., and CCF member; his research interests include computer vision. MA Peihong, born in 1998 in Zhengzhou, Henan, is an M.S. candidate; her research interests include computer vision. HEI Xinhong, born in 1976 in Yan'an, Shaanxi, is a professor, Ph.D., and CCF Distinguished Member; his research interests include machine learning.
  • Supported by:
    National Natural Science Foundation of China (61976177); Key Research and Development Program of Shaanxi Province (2022-GY-082, 2023-YBGY-222)



Abstract: To address scale variation, background interference, and partial occlusion in crowd counting for complex scenes, a Dilated Contextual Convolutional Neural Network based on a dual attention mechanism (DA-DCCNN) was proposed on the basis of the dilated convolution operation. Firstly, the convolutional layers of VGG16 were used as the feature extractor to obtain abstract, deep feature maps of crowd images. Then, a Dilation Context Module (DCM) was constructed with dilated convolutions to connect the features extracted at different layers, and a Spatial Attention Module (SAM) and a Channel Attention Module (CAM) were introduced to capture contextual information. Finally, a loss function combining the Euclidean distance and cross-entropy was constructed to measure the difference between the predicted attention map and the ground-truth attention map. Experimental results on three public datasets, ShanghaiTech, UCF_CC_50 and UCF-QNRF, show that DA-DCCNN effectively captures multi-scale features of images while enhancing the perception of important regions and channels, and achieves relatively optimal Mean Absolute Error (MAE) results. Therefore, the feature fusion network based on the dual attention mechanism can effectively perceive the spatial structure and local features in images, so that the generated density maps predict and count crowd regions more accurately.
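The abstract outlines the dual attention design (a channel attention module and a spatial attention module applied to VGG16 features) and a loss that combines Euclidean distance with cross-entropy, but gives no implementation details. The following is a minimal sketch of that general scheme, assuming a squeeze-and-excitation style channel attention, a CBAM-style spatial attention, and an MSE plus binary cross-entropy objective; the module names, reduction ratio, kernel size, and weighting factor lam are illustrative assumptions, not the authors' DA-DCCNN code.

```python
# Illustrative sketch only: generic channel/spatial dual attention blocks and a
# combined Euclidean + cross-entropy loss of the kind the abstract describes.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (CAM-like)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))        # global average pool -> per-channel weights
        return x * w.view(b, c, 1, 1)          # re-weight each channel


class SpatialAttention(nn.Module):
    """Spatial attention (SAM-like): a per-pixel weight map from pooled features."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor):
        avg = x.mean(dim=1, keepdim=True)      # channel-wise average
        mx, _ = x.max(dim=1, keepdim=True)     # channel-wise max
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn, attn                  # weighted features and the attention map


def dual_attention_loss(pred_density, gt_density, pred_attn, gt_attn, lam=0.1):
    """Euclidean (MSE) loss on density maps plus cross-entropy on attention maps.

    The relative weight `lam` is an assumed hyperparameter.
    """
    mse = F.mse_loss(pred_density, gt_density)
    bce = F.binary_cross_entropy(pred_attn, gt_attn)
    return mse + lam * bce


if __name__ == "__main__":
    feats = torch.randn(2, 512, 48, 64)        # e.g. VGG16 conv features of crowd images
    feats = ChannelAttention(512)(feats)
    feats, attn = SpatialAttention()(feats)
    print(feats.shape, attn.shape)
```

In a complete counting pipeline, such blocks would sit between the VGG16 front end and the dilated-convolution context layers that regress the density map; the predicted attention map returned by the spatial module is what the cross-entropy term would compare against a ground-truth attention mask of the head regions.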

Key words: dilated convolution, contextual feature, dual-attention mechanism, density map, crowd counting

CLC number: