Journal of Computer Applications, 2022, Vol. 42, Issue (9): 2800-2806. DOI: 10.11772/j.issn.1001-9081.2021071216

• Advanced computing

Pooling algorithm based on Gaussian function

Yuhang WANG, Yongxia ZHOU, Liangwu WU

  1. College of Information Engineering, China Jiliang University, Hangzhou Zhejiang 310018, China
  • Received: 2021-07-13 Revised: 2021-09-21 Accepted: 2021-09-24 Online: 2021-10-18 Published: 2022-09-10
  • Contact: Yuhang WANG
  • About the authors: ZHOU Yongxia, born in 1975, Ph. D., associate professor. His research interests include computer image and video processing, machine vision, and artificial intelligence.
    WU Liangwu, born in 1995, M. S. candidate. His research interests include image processing and computer vision.
  • Supported by:
    Natural Science Foundation of Zhejiang Province (LY19F030013)

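Pooling algorithm based on Gaussian function

Yuhang WANG, Yongxia ZHOU, Liangwu WU

  1. College of Information Engineering, China Jiliang University, Hangzhou 310018
  • Corresponding author: Yuhang WANG
  • About the authors: ZHOU Yongxia (born 1975), male, from Zhuji, Zhejiang, associate professor, Ph. D. Main research interests: computer image and video processing, machine vision, artificial intelligence;
    WU Liangwu (born 1995), male, from Fuzhou, Jiangxi, M. S. candidate. Main research interests: image processing, computer vision.
  • Supported by:
    Natural Science Foundation of Zhejiang Province (LY19F030013)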

Abstract:

To address the problem that traditional pooling algorithms in the Convolutional Neural Network (CNN) do not adequately account for the correlation between each element in the pooling domain and the features that the domain contains, a pooling algorithm based on the Gaussian function was proposed. Firstly, the three parameters of the Gaussian function were calculated from the value of each element in the pooling domain and the maximum value of all elements. Then, the Gaussian function was used to calculate a weight for every element in the pooling domain. Finally, the weighted average of all elements in the pooling domain was calculated with these weights and used as the pooling result. LeNet5, VGG (Visual Geometry Group)16, ResNet (Residual Network)18 and MobileNet v3 were selected as the experimental models. Experiments were carried out on the public datasets CIFAR-10, Fer2013 and German Traffic Sign Recognition Benchmark (GTSRB), with max pooling, average pooling, random pooling, mixed pooling, fuzzy pooling, fused random pooling and soft pooling as the algorithms for comparison. Experimental results show that the proposed algorithm improves accuracy by 0.5 to 6 percentage points over the other algorithms on the three datasets, and that it runs more efficiently than all of the compared pooling algorithms except max pooling and average pooling. These results verify that the proposed algorithm is effective and suited to situations where the accuracy requirement is high but the running-time requirement is not.

Key words: Gaussian function, pooling, weighted average, Convolutional Neural Network (CNN), CIFAR-10, Fer2013, German Traffic Sign Recognition Benchmark (GTSRB)
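
The three steps in the abstract (fit a Gaussian to the pooling domain, turn it into per-element weights, take the weighted average) can be illustrated with a minimal pure-Python sketch. The abstract does not state how the three Gaussian parameters are computed, so the choices below are assumptions made only for illustration: amplitude a = 1, centre b = the maximum of the pooling domain, and width c = the root-mean-square distance of the elements from that maximum. The function names `gaussian_pool_window` and `gaussian_pool2d` are likewise hypothetical and not taken from the paper.

```python
import math


def gaussian_pool_window(values, eps=1e-12):
    """Pool one pooling domain (a flat list of floats) with Gaussian weights.

    ASSUMED parameterization (not specified in the abstract):
      a = 1                     amplitude (cancels in the weighted average)
      b = max(values)           centre of the Gaussian
      c^2 = mean((v - b)^2)     squared width, measured from the maximum
    """
    b = max(values)
    c2 = sum((v - b) ** 2 for v in values) / len(values)
    if c2 < eps:
        # All elements are (nearly) equal, so every weight would be the same.
        return b
    # Weight of each element: a * exp(-(v - b)^2 / (2 c^2)) with a = 1.
    weights = [math.exp(-((v - b) ** 2) / (2.0 * c2)) for v in values]
    # Weighted average of the elements is the pooling result.
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def gaussian_pool2d(fmap, size=2, stride=2):
    """Apply the window pooling above over a 2-D feature map (list of rows)."""
    rows, cols = len(fmap), len(fmap[0])
    out = []
    for r in range(0, rows - size + 1, stride):
        out_row = []
        for c in range(0, cols - size + 1, stride):
            window = [fmap[r + i][c + j]
                      for i in range(size) for j in range(size)]
            out_row.append(gaussian_pool_window(window))
        out.append(out_row)
    return out


if __name__ == "__main__":
    feature_map = [
        [1.0, 2.0, 0.0, 0.0],
        [3.0, 4.0, 0.0, 8.0],
        [5.0, 5.0, 1.0, 1.0],
        [5.0, 5.0, 1.0, 2.0],
    ]
    print(gaussian_pool2d(feature_map))
```

Under these assumed parameters, elements close to the maximum receive weights near 1 and smaller elements are down-weighted smoothly, so the output of each window falls between what average pooling and max pooling would return for it.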

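Abstract:

To address the problem that traditional pooling algorithms in the Convolutional Neural Network (CNN) cannot adequately take into account the correlation between each element in the pooling domain and the features contained in that pooling domain, a pooling algorithm based on the Gaussian function is proposed. First, the three parameter values of the Gaussian function are calculated from the values of the elements in the pooling domain and the maximum value of all elements; then the Gaussian function is used to calculate the weights of all elements in the pooling domain; finally, the weighted average of all element values in the pooling domain is calculated according to these weights and taken as the pooling result. LeNet5, VGG16, ResNet18 and MobileNet v3 are selected as the experimental models, experiments are carried out on the public datasets CIFAR-10, Fer2013 and German Traffic Sign Recognition Benchmark (GTSRB), and the proposed algorithm is compared with seven pooling algorithms: max pooling, average pooling, random pooling, mixed pooling, fuzzy pooling, fused random pooling and soft pooling. Experimental results show that, on the three datasets, the proposed algorithm improves accuracy by 0.5 to 6 percentage points compared with the other algorithms, and that its running efficiency is better than that of the other pooling algorithms except max pooling and average pooling, thereby verifying that the proposed algorithm is effective and suitable for cases where the requirement on running time is not high but the requirement on accuracy is high.

Key words: Gaussian function, pooling, weighted average, Convolutional Neural Network, CIFAR-10, Fer2013, German Traffic Sign Recognition Benchmark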

CLC Number: