Journal of Computer Applications, 2019, Vol. 39, Issue (6): 1601-1606. DOI: 10.11772/j.issn.1001-9081.2018122501

• Artificial intelligence •

Improved convolution neural network model averaging method based on Dropout

CHENG Junhua, ZENG Guohui, LU Dunke, HUANG Bo   

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Received: 2018-12-19   Revised: 2019-02-26   Online: 2019-06-17   Published: 2019-06-10
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61603242), the Project of Jiangxi Provincial Economic Crime Collaborative Innovation Center of Prevention and Control Technology (JXJZXTCX-030).

基于Dropout的改进卷积神经网络模型平均方法

程俊华, 曾国辉, 鲁敦科, 黄勃   

  1. 上海工程技术大学 电子电气工程学院, 上海 201620
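  • Corresponding author: CHENG Junhua
  • About the authors: CHENG Junhua (born 1994), male, from Luoyang, Henan, M.S. candidate; research interests: pattern recognition and computer vision. ZENG Guohui (born 1975), male, from Le'an, Jiangxi, associate professor, Ph.D.; research interests: intelligent control, and power electronic systems and their control. LU Dunke (born 1983), male, from Xianning, Hubei, lecturer, Ph.D.; research interests: optical fiber sensing and specialty optical fiber design. HUANG Bo (born 1985), male, from Wuhan, Hubei, lecturer, Ph.D.; research interests: artificial intelligence and big data.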

Abstract: To address overfitting in deep Convolutional Neural Networks (CNNs), a model prediction averaging method based on a Dropout-improved CNN is proposed. First, in the training phase, Dropout is applied in the pooling layers to make the pooling-layer unit values sparse. Then, in the testing phase, the probability with which pooling-layer Dropout selects a unit value is multiplied by the probability of that unit value within its pooling region, giving a double probability. Finally, the proposed double-probability weighted model averaging method is applied in the testing phase, so that the sparsity induced by pooling-layer Dropout during training is better reflected in the pooling layers at test time, bringing the testing error close to the low error achieved in training. With the given network size, the proposed method achieves testing error rates of 0.31% on MNIST and 11.23% on CIFAR-10. The experimental results show that, when only the effect of the pooling layer is considered, the proposed method has a lower error rate than Prob. weighted pooling and Stochastic Pooling. This indicates that pooling-layer Dropout makes the model generalize better, and that the pooling unit values contribute to generalization and help avoid overfitting more effectively.

Key words: deep learning, Convolutional Neural Network (CNN), Dropout regularization, overfitting, model averaging
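
The abstract does not state the formulas, so the following is only a minimal sketch of one possible reading of the training-phase and testing-phase steps it describes, not the authors' implementation. It rests on several assumptions: activations in a pooling region are non-negative (for example after ReLU); a unit's probability within the region is its value divided by the region sum; the Dropout selection probability is that of max-pooling Dropout with retain probability q, where a unit is selected when it is kept and every larger unit is dropped; and the test-time output is the sum of the unit values weighted by the product of the two probabilities. The function names and the parameter `retain_prob` are hypothetical.

```python
import numpy as np


def train_pool_with_dropout(region, retain_prob, rng):
    """Training phase (assumed form): drop units at random, then max-pool."""
    a = np.asarray(region, dtype=float).ravel()
    # Each unit is kept independently with probability retain_prob.
    mask = rng.random(a.size) < retain_prob
    # Dropped units become 0; with non-negative inputs the max is 0 if all are dropped.
    return float(np.max(a * mask))


def double_probability_pool(region, retain_prob):
    """Testing phase (assumed form): double-probability weighted average."""
    a = np.asarray(region, dtype=float).ravel()
    total = a.sum()
    if total <= 0.0:
        return 0.0  # all-zero region: nothing to weight

    # Probability of each unit value within the pooling region (assumes a >= 0).
    region_prob = a / total

    # rank[i] = number of units ordered before unit i, from largest to smallest.
    order = np.argsort(-a, kind="stable")
    rank = np.empty_like(order)
    rank[order] = np.arange(a.size)

    # Assumed selection probability under max-pooling Dropout:
    # unit i is kept (q) and all larger units are dropped ((1 - q) ** rank).
    select_prob = retain_prob * (1.0 - retain_prob) ** rank

    # Double probability: product of the two probabilities, used as weights.
    double_prob = select_prob * region_prob
    return float(np.sum(double_prob * a))


print(double_probability_pool([1.0, 3.0], 0.5))  # 1.1875
```

In this sketch the weights are left unnormalized; whether the paper renormalizes the double probabilities is not stated in the abstract.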


