Pooling algorithm based on Gaussian function

doi:10.11772/j.issn.1001-9081.2021071216

Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (9): 2800-2806.

• Advanced computing •

Yuhang WANG, Yongxia ZHOU, Liangwu WU

College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang 310018, China

Received: 2021-07-13 Revised: 2021-09-21 Accepted: 2021-09-24 Online: 2021-10-18 Published: 2022-09-10
Contact: Yuhang WANG
About the authors: ZHOU Yongxia, born in 1975 in Zhuji, Zhejiang, Ph.D., associate professor. His research interests include computer image and video processing, machine vision, and artificial intelligence.
WU Liangwu, born in 1995 in Fuzhou, Jiangxi, M.S. candidate. His research interests include image processing and computer vision.
Supported by: Natural Science Foundation of Zhejiang Province (LY19F030013)

Abstract

Traditional pooling algorithms in Convolutional Neural Networks (CNNs) do not adequately account for the correlation between each element in the pooling domain and the features that the pooling domain contains. To address this problem, a pooling algorithm based on the Gaussian function was proposed. First, the three parameters of the Gaussian function were calculated from the value of each element in the pooling domain and the maximum value of all elements. Then, the Gaussian function was used to calculate the weights of all elements in the pooling domain. Finally, the weighted average of all element values in the pooling domain was calculated with these weights and used as the pooling result. LeNet5, VGG16, ResNet18 and MobileNet v3 were selected as the experimental models. Experiments were carried out on the public datasets CIFAR-10, Fer2013 and German Traffic Sign Recognition Benchmark (GTSRB), and the proposed algorithm was compared with seven pooling algorithms: max pooling, average pooling, random pooling, mixed pooling, fuzzy pooling, fused random pooling and soft pooling. Experimental results show that the proposed algorithm improves accuracy by 0.5 to 6 percentage points over the other algorithms on the three datasets, and that its running efficiency is higher than that of the other pooling algorithms except max pooling and average pooling. These results verify that the proposed algorithm is effective and is suited to situations where the demand on operation time is not high but the demand on accuracy is high.

Key words: Gaussian function, pooling, weighted average, Convolutional Neural Network (CNN), CIFAR-10, Fer2013, German Traffic Sign Recognition Benchmark (GTSRB)
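
The three steps in the abstract (derive Gaussian parameters from the window, compute per-element weights, take the weighted average) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the parameter formulas, so the Gaussian form a·exp(−(x−b)²/(2c²)) with a = 1, b = the window maximum and c = the window standard deviation is assumed here, and the function names `gaussian_pool_window` and `gaussian_pool2d` are hypothetical.

```python
import numpy as np


def gaussian_pool_window(window, eps=1e-12):
    """Pool one window with a Gaussian-weighted average (illustrative sketch)."""
    x = np.asarray(window, dtype=np.float64).ravel()
    a = 1.0        # amplitude: ASSUMED; it cancels in the normalized average
    b = x.max()    # center: ASSUMED to be the maximum of the window
    c = x.std()    # width: ASSUMED to be the standard deviation of the window
    if c < eps:
        # Constant window: every element would get the same weight.
        return float(x.mean())
    # Elements closer to the maximum receive larger weights.
    weights = a * np.exp(-((x - b) ** 2) / (2.0 * c ** 2))
    return float(np.sum(weights * x) / np.sum(weights))


def gaussian_pool2d(feature_map, size=2, stride=2):
    """Apply the window pooling above over a 2-D feature map."""
    fm = np.asarray(feature_map, dtype=np.float64)
    height, width = fm.shape
    out_h = (height - size) // stride + 1
    out_w = (width - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, s = i * stride, j * stride
            out[i, j] = gaussian_pool_window(fm[r:r + size, s:s + size])
    return out


if __name__ == "__main__":
    demo = np.array([[1.0, 2.0, 0.0, 1.0],
                     [3.0, 4.0, 1.0, 0.0],
                     [5.0, 5.0, 2.0, 8.0],
                     [5.0, 5.0, 6.0, 4.0]])
    print(gaussian_pool2d(demo))
```

Under these assumptions the weights grow with the element value, so each output lies between the plain average and the maximum of its window, which is the intuition behind a pooling rule that sits between average pooling and max pooling.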

CLC number: TP391.4

Yuhang WANG, Yongxia ZHOU, Liangwu WU. Pooling algorithm based on Gaussian function[J]. Journal of Computer Applications, 2022, 42(9): 2800-2806.

Input size	Operation	Activation function	Stride
224×224×3	Conv2d, 3×3	h-swish	2
112×112×16	Bneck, 3×3	ReLU	2
56×56×16	Bneck, 3×3	ReLU	2
28×28×24	Bneck, 3×3	ReLU	1
28×28×24	Bneck, 5×5	h-swish	2
14×14×40	Bneck, 5×5	h-swish	1
14×14×40	Bneck, 5×5	h-swish	1
14×14×40	Bneck, 5×5	h-swish	1
14×14×48	Bneck, 5×5	h-swish	1
14×14×48	Bneck, 5×5	h-swish	2
7×7×96	Bneck, 5×5	h-swish	1
7×7×96	Conv2d, 1×1	h-swish	1
7×7×576	Pool, 7×7	-	1
1×1×576	Conv2d, 1×1	h-swish	1
1×1×1024	Conv2d, 1×1	-	1

Experimental model	Epoch 20	Epoch 40	Epoch 60
LeNet5-max pooling	55.030	56.160	57.020
LeNet5-average pooling	50.160	53.010	54.790
LeNet5-random pooling	53.030	57.280	58.910
LeNet5-mixed pooling	54.860	57.900	61.810
LeNet5-fuzzy pooling	55.790	59.010	62.390
LeNet5-fused random pooling	56.420	60.720	63.750
LeNet5-soft pooling	57.920	61.750	64.960
LeNet5-proposed algorithm	59.370	62.780	65.480
VGG16-max pooling	79.090	82.840	83.810
VGG16-average pooling	75.730	81.080	82.630
VGG16-random pooling	77.720	84.390	84.610
VGG16-mixed pooling	78.620	82.970	83.880
VGG16-fuzzy pooling	79.860	83.910	84.860
VGG16-fused random pooling	80.040	84.100	84.570
VGG16-soft pooling	81.850	84.160	84.850
VGG16-proposed algorithm	82.760	84.650	85.610
ResNet18-max pooling	81.820	85.740	86.830
ResNet18-average pooling	82.540	86.400	87.420
ResNet18-random pooling	83.040	86.900	88.110
ResNet18-mixed pooling	82.940	86.630	87.870
ResNet18-fuzzy pooling	82.180	85.970	88.170
ResNet18-fused random pooling	82.970	86.540	88.350
ResNet18-soft pooling	83.090	87.650	89.200
ResNet18-proposed algorithm	83.930	87.940	89.930
MobileNet-max pooling	77.040	80.890	82.750
MobileNet-average pooling	76.880	80.670	82.890
MobileNet-random pooling	78.940	82.270	84.000
MobileNet-mixed pooling	78.210	82.390	83.680
MobileNet-fuzzy pooling	77.990	81.530	83.520
MobileNet-fused random pooling	79.010	82.470	84.140
MobileNet-soft pooling	78.980	82.330	84.170
MobileNet-proposed algorithm	79.100	82.990	84.720

Experimental model	Epoch 10	Epoch 20	Epoch 30
LeNet5-max pooling	44.581	46.893	48.119
LeNet5-average pooling	43.299	44.887	47.033
LeNet5-random pooling	43.689	46.308	49.206
LeNet5-mixed pooling	45.302	48.590	48.760
LeNet5-fuzzy pooling	47.181	48.760	51.156
LeNet5-fused random pooling	47.334	49.955	52.082
LeNet5-soft pooling	48.878	50.165	52.674
LeNet5-proposed algorithm	49.596	50.961	53.385
VGG16-max pooling	49.847	55.503	57.816
VGG16-average pooling	50.822	57.453	58.596
VGG16-random pooling	46.197	55.113	61.048
VGG16-mixed pooling	48.531	56.450	59.571
VGG16-fuzzy pooling	51.420	57.156	60.960
VGG16-fused random pooling	52.147	57.098	61.082
VGG16-soft pooling	53.461	57.385	61.904
VGG16-proposed algorithm	53.803	58.011	62.435
ResNet18-max pooling	50.334	55.974	60.978
ResNet18-average pooling	50.348	56.293	60.841
ResNet18-random pooling	50.485	56.590	61.624
ResNet18-mixed pooling	51.240	56.961	62.544
ResNet18-fuzzy pooling	51.026	57.476	63.098
ResNet18-fused random pooling	50.702	57.998	63.345
ResNet18-soft pooling	51.395	58.132	64.333
ResNet18-proposed algorithm	52.187	59.457	65.676
MobileNet-max pooling	50.485	54.755	58.080
MobileNet-average pooling	50.181	54.659	57.863
MobileNet-random pooling	51.006	54.802	58.240
MobileNet-mixed pooling	50.702	55.076	58.302
MobileNet-fuzzy pooling	50.847	55.916	58.647
MobileNet-fused random pooling	51.017	55.715	58.461
MobileNet-soft pooling	51.096	55.637	58.996
MobileNet-proposed algorithm	51.028	56.450	59.739

Experimental model	Epoch 5	Epoch 10	Epoch 15
LeNet5-max pooling	45.891	67.675	82.668
LeNet5-average pooling	45.142	68.324	82.443
LeNet5-random pooling	46.151	68.890	83.905
LeNet5-mixed pooling	45.801	67.712	83.494
LeNet5-fuzzy pooling	46.246	68.567	84.099
LeNet5-fused random pooling	45.980	68.759	84.560
LeNet5-soft pooling	46.721	69.000	84.318
LeNet5-proposed algorithm	47.130	69.450	85.074
VGG16-max pooling	53.692	70.246	89.397
VGG16-average pooling	54.026	70.856	88.180
VGG16-random pooling	53.280	69.960	90.269
VGG16-mixed pooling	54.678	70.210	89.517
VGG16-fuzzy pooling	54.024	71.538	90.837
VGG16-fused random pooling	55.099	72.986	90.920
VGG16-soft pooling	54.689	73.223	91.205
VGG16-proposed algorithm	55.568	74.642	92.167
ResNet18-max pooling	70.801	87.769	96.837
ResNet18-average pooling	71.220	87.387	97.100
ResNet18-random pooling	72.413	89.657	97.375
ResNet18-mixed pooling	71.070	88.165	96.998
ResNet18-fuzzy pooling	72.142	90.120	97.554
ResNet18-fused random pooling	72.814	90.105	97.445
ResNet18-soft pooling	72.909	91.025	97.869
ResNet18-proposed algorithm	74.678	91.920	98.708
MobileNet-max pooling	64.142	84.249	94.070
MobileNet-average pooling	64.494	84.129	93.756
MobileNet-random pooling	65.046	85.078	94.801
MobileNet-mixed pooling	64.000	84.935	94.130
MobileNet-fuzzy pooling	63.958	85.140	94.814
MobileNet-fused random pooling	65.176	85.394	94.732
MobileNet-soft pooling	65.169	85.373	94.810
MobileNet-proposed algorithm	65.373	85.589	95.336

Pooling algorithm	Resolution 100×100	Resolution 1000×1000	Resolution 10000×10000
Max pooling	0.093	4.675	307.083
Average pooling	0.252	9.025	867.590
Random pooling	0.444	12.446	1268.931
Mixed pooling	0.196	8.988	831.617
Fuzzy pooling	0.689	16.031	1812.159
Fused random pooling	0.631	15.296	1604.357
Soft pooling	0.454	13.155	1239.741
Proposed algorithm	0.372	11.387	1036.137
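
A small sketch for reading the running-time figures above: it ranks the eight pooling algorithms at each resolution and expresses each time relative to max pooling. The values are copied from the table; the table does not state their unit, and the helper names `rank_by_time` and `relative_to` are hypothetical. In these figures the proposed algorithm is the fourth fastest at every resolution, after max, mixed and average pooling.

```python
# Running times copied from the table above (unit not stated in the source).
RESOLUTIONS = ("100x100", "1000x1000", "10000x10000")
TIMES = {
    "max pooling": (0.093, 4.675, 307.083),
    "average pooling": (0.252, 9.025, 867.590),
    "random pooling": (0.444, 12.446, 1268.931),
    "mixed pooling": (0.196, 8.988, 831.617),
    "fuzzy pooling": (0.689, 16.031, 1812.159),
    "fused random pooling": (0.631, 15.296, 1604.357),
    "soft pooling": (0.454, 13.155, 1239.741),
    "proposed algorithm": (0.372, 11.387, 1036.137),
}


def rank_by_time(times, column):
    """Return algorithm names sorted from fastest to slowest for one column."""
    return sorted(times, key=lambda name: times[name][column])


def relative_to(times, baseline, column):
    """Return each algorithm's time divided by the baseline's time."""
    base = times[baseline][column]
    return {name: values[column] / base for name, values in times.items()}


if __name__ == "__main__":
    for column, resolution in enumerate(RESOLUTIONS):
        ratios = relative_to(TIMES, "max pooling", column)
        print(resolution)
        for name in rank_by_time(TIMES, column):
            print(f"  {name:<22}{ratios[name]:6.2f}x max pooling")
```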
