
Survey on Interpretability of Deep Learning

Ling-Min LI (李凌敏)1, Meng-Ran HOU (侯梦然)1, Kun CHEN (陈琨)1, Jun-Min LIU (刘军民)2

  1. School of Mathematics and Statistics, Xi'an Jiaotong University
  2. Xi'an Jiaotong University
  • Received: 2021-09-22 Revised: 2022-01-09 Online: 2022-04-15
  • Corresponding author: Jun-Min LIU
  • Supported by:
    National Natural Science Foundation of China

Abstract: In recent years, deep learning has been widely used in many fields. However, because deep neural network models rely on highly nonlinear operations, their interpretability is poor. They are often called “black box” models and cannot be applied in some key fields with high performance requirements. It is therefore necessary to study the interpretability of deep learning. First, deep learning is briefly introduced. Then, CiteSpace is used to retrieve, analyze and visualize the relevant literature in the Web of Science database, which shows that interpretability has been a hot research direction in computer science and artificial intelligence in recent years. Existing work on the interpretability of deep learning is then analyzed from eight aspects: hidden layer visualization, class activation mapping, sensitivity analysis, the frequency principle, robustness perturbation testing, information theory, interpretable modules and optimization methods. Applications of this work in network security, recommender systems, medicine and social networks are also presented. Finally, the open problems and future development directions of deep learning interpretability research are discussed.

Key words: deep learning, interpretability, hidden layer visualization, class activation mapping, frequency principle, interpretable module, information theory

CLC number: