Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (2): 331-343.DOI: 10.11772/j.issn.1001-9081.2023020166

• Artificial intelligence •    

Review of mean field theory for deep neural network

Mengmei YAN1,2,3, Dongping YANG2,3

  1. School of Advanced Manufacturing, Fuzhou University, Quanzhou Fujian 362000, China
    2. Quanzhou Institute of Equipment Manufacturing, Haixi Institutes, Chinese Academy of Sciences, Quanzhou Fujian 362200, China
    3. Research Center for Human-Machine Augmented Intelligence, Zhejiang Lab, Hangzhou Zhejiang 311101, China
  • Received:2023-02-23 Revised:2023-04-17 Accepted:2023-04-20 Online:2023-08-14 Published:2024-02-10
  • Contact: Dongping YANG
  • About author:YAN Mengmei, born in 1997 in Luzhou, Sichuan, M. S. candidate. Her research interests include machine learning and dynamical mean field theory.
  • Supported by:
    National Natural Science Foundation of China(12175242)


Abstract:

Mean Field Theory (MFT) provides profound insights into the operation mechanism of Deep Neural Networks (DNNs) and can theoretically guide the engineering design of deep learning. In recent years, more and more researchers have devoted themselves to the theoretical study of DNNs; in particular, a series of works based on mean field theory has attracted wide attention. To this end, research on mean field theory for deep neural networks was reviewed, introducing the latest theoretical findings on three basic aspects of deep neural networks: initialization, training process, and generalization performance. Specifically, the concepts, properties, and applications of the edge of chaos and dynamical isometry for initialization were introduced, the training properties of overparameterized networks and their equivalent networks were analyzed, and the generalization performance of various network architectures was analyzed theoretically, showing that mean field theory is a very important basic theoretical approach to understanding the mechanisms of deep neural networks. Finally, the main challenges and future research directions of mean field theory in the initialization, training, and generalization phases of DNNs were summarized.

Key words: Deep Neural Network (DNN), dynamics, Mean Field Theory (MFT), stochastic initialization, generalization
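The edge-of-chaos initialization mentioned in the abstract can be illustrated numerically. The following sketch (not from the reviewed works; the function name, widths, and parameter values are illustrative assumptions) propagates two correlated inputs through a random tanh network and tracks their correlation layer by layer. In the ordered phase (small weight variance) the two signals are driven together; in the chaotic phase (large weight variance) their correlation is driven down, which is the mean-field order-to-chaos picture.

```python
import numpy as np

def propagate_correlation(sigma_w, sigma_b=0.3, width=2000, depth=30, seed=0):
    """Empirically track the correlation of two inputs pushed through
    a random fully connected tanh network with i.i.d. Gaussian weights
    W_ij ~ N(0, sigma_w^2 / width) and biases b_i ~ N(0, sigma_b^2)."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(width)
    # second input with initial correlation ~0.9 to the first
    x2 = 0.9 * x1 + np.sqrt(1 - 0.9**2) * rng.standard_normal(width)
    cors = []
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * (sigma_w / np.sqrt(width))
        b = rng.standard_normal(width) * sigma_b
        x1, x2 = np.tanh(W @ x1 + b), np.tanh(W @ x2 + b)  # same W, b for both
        cors.append(float(np.corrcoef(x1, x2)[0, 1]))
    return cors

# ordered phase: correlations are pulled toward 1 with depth
ordered = propagate_correlation(sigma_w=0.8)
# chaotic phase: nearby inputs decorrelate with depth
chaotic = propagate_correlation(sigma_w=2.5)
```

At the critical line separating the two phases (the "edge of chaos"), correlations neither collapse nor explode, which is why such initializations allow signals to propagate through very deep networks.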

