Parameter asynchronous updating algorithm based on multi-column convolutional neural network

Journal of Computer Applications, 2022, Vol. 42, Issue (2): 395-403. DOI: 10.11772/j.issn.1001-9081.2021020367

Special topic: Artificial intelligence

Xinyu CHEN¹, Mingzhe LIU¹, Jun REN², Ying TANG³

1. State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (Chengdu University of Technology), Chengdu Sichuan 610059, China
2. College of Artificial Intelligence (College of Automation and Information Engineering), Sichuan University of Science & Engineering, Zigong Sichuan 644000, China
3. College of Computer and Network Security (Oxford Brookes College), Chengdu University of Technology, Chengdu Sichuan 610059, China

Received: 2021-03-11  Revised: 2021-07-16  Accepted: 2021-07-20  Online: 2022-02-11  Published: 2022-02-10
Contact: Mingzhe LIU
About the authors:
CHEN Xinyu, born in 1996 in Chongqing, M. S. candidate. Her research interests include deep learning and computer vision.
LIU Mingzhe, born in 1970 in Chifeng, Inner Mongolia, Ph. D., professor, doctoral supervisor, CCF member. His research interests include data science and network security.
REN Jun, born in 1988 in Chengdu, Sichuan, Ph. D. His research interests include natural language processing and artificial intelligence.
TANG Ying, born in 1978 in Dujiangyan, Sichuan, Ph. D., professor. His research interests include machine learning and information processing.
Supported by: Sichuan Science and Technology Innovative Seedling Project Cultivation Program (2020021)

Abstract

Existing crowd counting algorithms optimize deep learning networks synchronously and manually and ignore the negative information learned by the network, which leads to a large number of redundant parameters or even overfitting and thereby degrades counting accuracy. To address this problem, a parameter asynchronous updating algorithm based on the Multi-column Convolutional Neural Network (MCNN) was proposed. Firstly, a single-frame image was input to the network, three columns of convolutions extracted features at different scales, and the correlation between the feature maps of every two columns was learned through the mutual information between columns. Then, the parameters of each column were updated asynchronously according to the optimized mutual information and the updated loss function until the algorithm converged. Finally, dynamic Kalman filtering was used to deeply fuse the density maps output by the columns, and all pixels in the fused density map were summed to obtain the total number of people in the image.

Experimental results show that on the UCF_CC_50 (University of Central Florida Crowd Counting) dataset, the Mean Absolute Error (MAE) of the proposed algorithm is 1.1% lower than that of ic-CNN+McML (Iterative Crowd Counting Convolutional Neural Network with Multi-column Mutual Learning), which has the best MAE among the compared methods on that dataset, and its Mean Squared Error (MSE) is 4.3% lower than that of the Contextual Pyramid Convolutional Neural Network (CP-CNN), which has the best MSE on that dataset. On the ShanghaiTech Part_A dataset, the MAE is 1.7% lower than that of ic-CNN+McML (best MAE on that dataset), and the MSE is 3.2% lower than that of ACSCP (Adversarial Cross-Scale Consistency Pursuit; best MSE on that dataset). On the ShanghaiTech Part_B dataset, the MAE and MSE are 18.3% and 35.2% lower, respectively, than those of ic-CNN+McML, which has the best MAE and MSE on that dataset. On the UCSD (University of California San Diego) dataset, the MAE and MSE are 1.9% and 9.8% lower, respectively, than those of ic-CNN+McML, which has the best MAE and MSE on that dataset. These results indicate that the algorithm effectively improves the accuracy and robustness of crowd counting, accepts input images of any size or resolution, and adapts to large variations in the scale of the detected targets.

Key words: machine vision, deep learning, Convolutional Neural Network (CNN), crowd counting, parameter asynchronous updating, multi-scale estimation
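The last two steps described in the abstract (fusing the per-column density maps and summing the fused map to get a count) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the Kalman filter's state model or noise settings, so the sketch assumes a static per-pixel state and hypothetical per-column noise variances, which reduces each update to inverse-variance weighting. The function names `fuse_density_maps` and `count_from_density` are hypothetical.

```python
def fuse_density_maps(maps, variances):
    """Fuse per-column density maps with a sequential scalar Kalman update.

    maps:      list of 2-D lists of equal shape, one density map per column.
    variances: assumed measurement-noise variance of each column
               (hypothetical values; the source does not specify them).
    """
    rows, cols = len(maps[0]), len(maps[0][0])
    # Start from the first column's map as the initial estimate.
    fused = [[float(v) for v in row] for row in maps[0]]
    p = float(variances[0])  # variance of the current estimate
    for m, r in zip(maps[1:], variances[1:]):
        gain = p / (p + r)  # Kalman gain for a static state
        for i in range(rows):
            for j in range(cols):
                fused[i][j] += gain * (m[i][j] - fused[i][j])
        p = (1.0 - gain) * p  # posterior variance after the update
    return fused


def count_from_density(density):
    """Estimated head count = sum of all pixels of the density map."""
    return sum(sum(row) for row in density)


# Toy 2x2 density maps standing in for the three column outputs.
col1 = [[0.50, 0.00], [0.25, 0.25]]  # sums to 1.0
col2 = [[0.75, 0.00], [0.50, 0.25]]  # sums to 1.5
col3 = [[1.00, 0.00], [0.75, 0.25]]  # sums to 2.0

fused = fuse_density_maps([col1, col2, col3], variances=[1.0, 1.0, 1.0])
print(count_from_density(fused))
```

With equal variances the fused map is the per-pixel mean of the three columns; a column given a larger variance contributes less to the fused estimate.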

CLC number: TP399

Cite this article: Xinyu CHEN, Mingzhe LIU, Jun REN, Ying TANG. Parameter asynchronous updating algorithm based on multi-column convolutional neural network[J]. Journal of Computer Applications, 2022, 42(2): 395-403.

Figures/Tables: 9

References: 23

1	LEMPITSKY V, ZISSERMAN A. Learning to count objects in images [C]// Proceedings of the 23rd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2010: 1324-1332.
2	LI Y X, TANG G B. Analysis of "several opinions on strengthening the networking application of public security video monitoring construction" [J]. China Public Security, 2015(13): 34-36. (in Chinese)
3	BA L J, CARUANA R. Do deep nets really need to be deep? [C]// Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2014: 2654-2662.
4	ZHANG Y Y, ZHOU D S, CHEN S Q, et al. Single-image crowd counting via multi-column convolutional neural network [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 589-597. DOI: 10.1109/cvpr.2016.70
5	CHENG Z Q, LI J X, DAI Q, et al. Improving the learning of multi-column convolutional neural network for crowd counting [C]// Proceedings of the 27th ACM International Conference on Multimedia. New York: ACM, 2019: 1897-1906. DOI: 10.1145/3343031.3350898
6	BOOMINATHAN L, KRUTHIVENTI S S, BABU R V. CrowdNet: a deep convolutional network for dense crowd counting [C]// Proceedings of the 24th ACM International Conference on Multimedia. New York: ACM, 2016: 640-644. DOI: 10.1145/2964284.2967300
7	RANJAN V, LE H, HOAI M. Iterative crowd counting [C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 278-293.
8	HUANG S Y, LI X, ZHANG Z F, et al. Body structure aware deep crowd counting [J]. IEEE Transactions on Image Processing, 2018, 27(3): 1049-1059. DOI: 10.1109/tip.2017.2740160
9	SINDAGI V A, PATEL V M. Generating high-quality crowd density maps using contextual pyramid CNNs [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 1879-1888. DOI: 10.1109/iccv.2017.206
10	SAM D B, SAJJAN N N, BABU R V, et al. Divide and grow: capturing huge diversity in crowd images with incrementally growing CNN [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 3618-3626. DOI: 10.1109/cvpr.2018.00381
11	CAO X K, WANG Z P, ZHAO Y Y, et al. Scale aggregation network for accurate and efficient crowd counting [C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11209. Cham: Springer, 2018: 757-773.
12	BUTTE A J, KOHANE I S. Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements [C]// Proceedings of the 2000 Pacific Symposium on Biocomputing. Singapore: World Scientific Publishing, 1999: 418-429. DOI: 10.1142/9789814447331_0040
13	BELGHAZI M I, BARATIN A, RAJESWAR S, et al. MINE: mutual information neural estimation [C]// Proceedings of the 35th International Conference on Machine Learning. New York: JMLR.org, 2018: 531-540.
14	DONSKER M D, VARADHAN S R S. Asymptotic evaluation of certain Markov process expectations for large time-III [J]. Communications on Pure and Applied Mathematics, 1976, 29(4): 389-461. DOI: 10.1002/cpa.3160290405
15	IDREES H, SALEEMI I, SEIBERT C, et al. Multi-source multi-scale counting in extremely dense crowd images [C]// Proceedings of the 2013 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2013: 2547-2554. DOI: 10.1109/cvpr.2013.329
16	CHAN A B, LIANG Z S J, VASCONCELOS N. Privacy preserving crowd monitoring: counting people without people models or tracking [C]// Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2008: 1-7. DOI: 10.1109/cvpr.2008.4587569
17	SINDAGI V A, PATEL V M. CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting [C]// Proceedings of the 14th IEEE International Conference on Advanced Video and Signal Based Surveillance. Piscataway: IEEE, 2017: 1-6. DOI: 10.1109/avss.2017.8078491
18	SHEN Z, XU Y, NI B B, et al. Crowd counting via adversarial cross-scale consistency pursuit [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 5245-5254. DOI: 10.1109/cvpr.2018.00550
19	ZHANG C, LI H S, WANG X G, et al. Cross-scene crowd counting via deep convolutional neural networks [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 833-841. DOI: 10.1109/cvpr.2015.7298684
20	LI Y H, ZHANG X F, CHEN D M. CSRNet: dilated convolutional neural networks for understanding the highly congested scenes [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 1091-1100. DOI: 10.1109/cvpr.2018.00120
21	SAM D B, SURYA S, BABU R V. Switching convolutional neural network for crowd counting [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 4031-4039. DOI: 10.1109/cvpr.2017.429
22	SHI Z L, ZHANG L, LIU Y, et al. Crowd counting with deep negative correlation learning [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 5382-5390. DOI: 10.1109/cvpr.2018.00564
23	ZHANG L, SHI M J, CHEN Q B. Crowd counting via scale-adaptive convolutional neural network [C]// Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2018: 1113-1121. DOI: 10.1109/wacv.2018.00127

Model	MAE	MSE
CCNN (Crowd CNN) [19]	465.90	499.80
C-MTL (Cascaded Multi-Task Learning) [17]	411.80	561.90
IG-CNN (Incrementally Growing CNN) [10]	323.70	399.20
CSRNet (Congested Scenes CNN) [20]	268.90	398.40
SwitchCNN [21]	321.60	443.40
CP-CNN [9]	296.90	324.80
ic-CNN [7]	261.80	366.40
ACSCP (Adversarial Cross-Scale Consistency Pursuit) [18]	291.70	405.10
Deep-NCL (Negative Correlation Learning) [22]	289.10	405.20
MCNN [4]	378.90	510.40
SaCNN (Scale-adaptive CNN) [23]	316.20	426.10
ic-CNN+McML [5]	244.80	359.20
A-MCNN	242.10	310.80
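The tables report MAE and MSE over per-image counts, but this page does not reproduce the formulas. The sketch below shows one common way such metrics are computed. It is an assumption, not the paper's stated definition: crowd counting papers often report the square root of the mean squared error under the label "MSE", so the hypothetical `root` flag exposes both variants.

```python
import math


def mae(pred_counts, true_counts):
    """Mean absolute error between predicted and ground-truth counts."""
    if len(pred_counts) != len(true_counts) or not true_counts:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - t) for p, t in zip(pred_counts, true_counts))
    return total / len(true_counts)


def mse(pred_counts, true_counts, root=True):
    """Mean squared error; with root=True, its square root (assumed convention)."""
    if len(pred_counts) != len(true_counts) or not true_counts:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((p - t) ** 2 for p, t in zip(pred_counts, true_counts))
    mean_sq /= len(true_counts)
    return math.sqrt(mean_sq) if root else mean_sq


# Made-up counts for three test images (illustrative only).
predicted = [10, 20, 33]
ground_truth = [12, 20, 30]
print(mae(predicted, ground_truth))
print(mse(predicted, ground_truth))
```

MAE reflects average accuracy, while the squared-error metric penalizes large per-image errors more heavily, which is why it is commonly read as an indicator of robustness.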

Model	ShanghaiTech Part_A MAE	ShanghaiTech Part_A MSE	ShanghaiTech Part_B MAE	ShanghaiTech Part_B MSE
CCNN [19]	185.30	280.50	34.20	51.30
C-MTL [17]	103.90	155.70	20.90	32.60
IG-CNN [10]	73.40	119.50	14.30	22.90
CSRNet [20]	69.30	116.10	10.80	17.10
SwitchCNN [21]	92.60	133.70	23.80	31.50
CP-CNN [9]	74.80	107.90	21.20	31.80
ic-CNN [7]	70.20	117.30	11.10	17.00
ACSCP [18]	76.20	103.50	17.90	28.40
Deep-NCL [22]	74.10	113.50	19.20	26.90
MCNN [4]	111.50	175.60	27.10	42.20
SaCNN [23]	87.20	140.70	16.90	26.30
ic-CNN+McML [5]	64.20	112.30	10.40	14.20
A-MCNN	63.10	100.20	8.50	9.20

Model	MAE	MSE
CCNN [19]	1.52	3.21
BSA-CNN [8]	1.06	1.49
CSRNet [20]	1.18	1.55
SwitchCNN [21]	1.69	2.48
MCNN+McML [5]	1.06	1.29
ic-CNN [7]	1.17	1.49
CSRNet+McML [5]	1.05	1.31
MCNN [5]	1.09	1.39
ic-CNN+McML [5]	1.05	1.23
A-MCNN	1.03	1.11
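The percentage improvements quoted in the abstract can be reproduced from the table rows above as relative reductions with respect to the best competing method. The short check below copies the relevant values from the tables. The helper name `relative_reduction` and the dictionary keys are hypothetical labels chosen for this illustration.

```python
def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return (baseline - proposed) / baseline * 100.0


# (best competing value, A-MCNN value) pairs copied from the tables above.
comparisons = {
    "first table, MAE vs ic-CNN+McML": (244.80, 242.10),
    "first table, MSE vs CP-CNN": (324.80, 310.80),
    "ShanghaiTech Part_A, MAE vs ic-CNN+McML": (64.20, 63.10),
    "ShanghaiTech Part_A, MSE vs ACSCP": (103.50, 100.20),
    "ShanghaiTech Part_B, MAE vs ic-CNN+McML": (10.40, 8.50),
    "ShanghaiTech Part_B, MSE vs ic-CNN+McML": (14.20, 9.20),
    "last table, MAE vs ic-CNN+McML": (1.05, 1.03),
    "last table, MSE vs ic-CNN+McML": (1.23, 1.11),
}

for name, (baseline, proposed) in comparisons.items():
    print(f"{name}: {relative_reduction(baseline, proposed):.1f}%")
```

Rounded to one decimal place, the eight values come out as 1.1%, 4.3%, 1.7%, 3.2%, 18.3%, 35.2%, 1.9% and 9.8%, matching the percentages quoted in the abstract.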
