Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (2): 395-403. DOI: 10.11772/j.issn.1001-9081.2021020367

• Artificial Intelligence •

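Asynchronous parameter updating algorithm based on a multi-column convolutional neural network

CHEN Xinyu1, LIU Mingzhe1, REN Jun2, TANG Ying3

  1. State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (Chengdu University of Technology), Chengdu 610059
  2. College of Artificial Intelligence (College of Automation and Information Engineering), Sichuan University of Science & Engineering, Zigong, Sichuan 644000
  3. College of Computer and Network Security (Oxford Brookes College), Chengdu University of Technology, Chengdu 610059
  • Received: 2021-03-11 Revised: 2021-07-16 Accepted: 2021-07-20 Online: 2021-08-04 Published: 2022-02-10
  • Corresponding author: LIU Mingzhe
  • About the authors: CHEN Xinyu (born 1996), female, from Chongqing, M.S. candidate. Research interests: deep learning, computer vision.
    LIU Mingzhe (born 1970), male, from Chifeng, Inner Mongolia, professor, doctoral supervisor, Ph.D., CCF member. Research interests: data science, network security.
    REN Jun (born 1988), male, from Chengdu, Sichuan, Ph.D. Research interests: natural language processing, artificial intelligence.
    TANG Ying (born 1978), male, from Dujiangyan, Sichuan, professor, Ph.D. Research interests: machine learning, information processing.
  • Funding:
    Sichuan Science and Technology Innovative Seedling Project Cultivation Program (2020021)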

Parameter asynchronous updating algorithm based on multi-column convolutional neural network

Xinyu CHEN1, Mingzhe LIU1, Jun REN2, Ying TANG3

  1. State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (Chengdu University of Technology), Chengdu, Sichuan 610059, China
  2. College of Artificial Intelligence (College of Automation and Information Engineering), Sichuan University of Science & Engineering, Zigong, Sichuan 644000, China
  3. College of Computer and Network Security (Oxford Brookes College), Chengdu University of Technology, Chengdu, Sichuan 610059, China
  • Received: 2021-03-11 Revised: 2021-07-16 Accepted: 2021-07-20 Online: 2021-08-04 Published: 2022-02-10
  • Contact: Mingzhe LIU
  • About the authors: CHEN Xinyu, born in 1996, M.S. candidate. Her research interests include deep learning and computer vision.
    LIU Mingzhe, born in 1970, Ph.D., professor. His research interests include data science and network security.
    REN Jun, born in 1988, Ph.D. His research interests include natural language processing and artificial intelligence.
    TANG Ying, born in 1978, Ph.D., professor. His research interests include machine learning and information processing.
  • Supported by:
    Sichuan Science and Technology Innovative Seedling Project Cultivation Program (2020021)

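Abstract (from the Chinese original):

Existing crowd counting algorithms optimize deep learning networks synchronously and by hand and ignore the negative information learned by the network, which produces many redundant parameters and even overfitting and in turn lowers counting accuracy. To address this problem, a parameter asynchronous updating algorithm based on the Multi-column Convolutional Neural Network (MCNN) is proposed. First, a single-frame image is fed into the network, three convolutional columns extract features at different scales, and the mutual information between columns is used to learn the correlation between the feature maps of each pair of columns. Next, the parameters of each column are updated asynchronously according to the optimized mutual information and the updated loss function until the algorithm converges. Finally, dynamic Kalman filtering deeply fuses the density maps output by the columns, and all pixels of the fused density map are summed to give the total number of people in the image. Experimental results show that, on the UCSD (University of California San Diego) dataset, the Mean Absolute Error (MAE) of the proposed algorithm is 1.1% lower than that of ic-CNN+McML (Iterative Crowd Counting Convolutional Neural Network Multi-column Mutual Learning), which has the best MAE on that dataset, and its Mean Square Error (MSE) is 4.3% lower than that of CP-CNN (Contextual Pyramid Convolutional Neural Network), which has the best MSE on that dataset. On the ShanghaiTech Part_A dataset, its MAE is 1.7% lower than that of ic-CNN+McML, which has the best MAE on that dataset, and its MSE is 3.2% lower than that of ACSCP (Adversarial Cross-Scale Consistency Pursuit), which has the best MSE on that dataset. On the ShanghaiTech Part_B dataset, its MAE and MSE are 18.3% and 35.2% lower, respectively, than those of ic-CNN+McML, which has the best MAE and MSE on that dataset. On the UCF_CC_50 (University of Central Florida Crowd Counting) dataset, its MAE and MSE are 1.9% and 9.8% lower, respectively, than those of ic-CNN+McML, which has the best MAE and MSE on that dataset. The algorithm therefore improves the accuracy and robustness of crowd counting, accepts input images of any size or resolution, and adapts to large scale variations of the detected targets.

Keywords: machine vision, deep learning, convolutional neural network, crowd counting, parameter asynchronous updating, multi-scale estimation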

Abstract:

Existing crowd counting algorithms optimize deep learning networks synchronously and manually while ignoring the negative information learned by the network, which leads to a large number of redundant parameters or even overfitting and thereby degrades counting accuracy. To address this problem, a parameter asynchronous updating algorithm based on the Multi-column Convolutional Neural Network (MCNN) was proposed. Firstly, a single-frame image was input to the network, three columns of convolutions extracted features at different scales, and the correlation between the feature maps of every two columns was learned through the mutual information between the columns. Then, the parameters of each column were updated asynchronously according to the optimized mutual information and the updated loss function until the algorithm converged. Finally, dynamic Kalman filtering was used to deeply fuse the density maps output by the columns, and all pixels in the fused density map were summed to obtain the total number of people in the image.
Experimental results show that on the UCSD (University of California San Diego) dataset, the Mean Absolute Error (MAE) of the proposed algorithm is 1.1% lower than that of ic-CNN+McML (iterative crowd counting Convolutional Neural Network Multi-column Mutual Learning), which has the best MAE on the dataset, and the Mean Square Error (MSE) of the proposed algorithm is 4.3% lower than that of the Contextual Pyramid Convolutional Neural Network (CP-CNN), which has the best MSE on the dataset; on the ShanghaiTech Part_A dataset, the MAE of the proposed algorithm is 1.7% lower than that of ic-CNN+McML, which has the best MAE on the dataset, and its MSE is 3.2% lower than that of ACSCP (Adversarial Cross-Scale Consistency Pursuit), which has the best MSE on the dataset; on the ShanghaiTech Part_B dataset, the MAE and MSE of the proposed algorithm are 18.3% and 35.2% lower, respectively, than those of ic-CNN+McML, which has the best MAE and MSE on the dataset; on the UCF_CC_50 (University of Central Florida Crowd Counting) dataset, the MAE and MSE of the proposed algorithm are 1.9% and 9.8% lower, respectively, than those of ic-CNN+McML, which has the best MAE and MSE on the dataset. These results show that the algorithm improves the accuracy and robustness of crowd counting, accepts input images of any size or resolution, and adapts to large scale variations of the detected targets.
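
The counting step and the reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the toy density maps are hypothetical, and the abstract does not give the exact formula behind "MSE" (crowd counting papers often report the square root of the mean squared count error under that name), so both variants are computed here. The relative reduction assumes that "x% lower" means (baseline - proposed) / baseline.

```python
import math
from typing import Sequence


def count_from_density(density_map: Sequence[Sequence[float]]) -> float:
    """Estimated head count: the sum of all pixels of a density map."""
    return sum(sum(row) for row in density_map)


def mae(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Mean absolute error between predicted and true counts."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def mse(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Mean of the squared count errors (no square root)."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)


def rmse(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Root of the mean squared count error, often labelled MSE in this field."""
    return math.sqrt(mse(pred, truth))


def relative_reduction(baseline: float, proposed: float) -> float:
    """Percentage by which `proposed` is lower than `baseline` (assumed definition)."""
    return 100.0 * (baseline - proposed) / baseline


# Hypothetical 2x2 density maps for three images (toy values, not real data).
density_maps = [
    [[0.5, 0.5], [1.0, 0.0]],
    [[0.25, 0.25], [0.25, 0.25]],
    [[2.0, 1.0], [1.0, 1.0]],
]
true_counts = [3.0, 1.0, 4.0]

pred_counts = [count_from_density(d) for d in density_maps]
print("predicted counts:", pred_counts)
print("MAE:", mae(pred_counts, true_counts))
print("MSE:", mse(pred_counts, true_counts))
print("RMSE:", rmse(pred_counts, true_counts))
print("reduction vs. a baseline error of 2.0:", relative_reduction(2.0, 1.5), "%")
```

Because the count is a plain sum over pixels, the same function applies to a density map of any height and width, which is consistent with the claim that the input image may have any size or resolution.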

Key words: machine vision, deep learning, Convolutional Neural Network (CNN), crowd counting, parameter asynchronous updating, multi-scale estimation

CLC number: