Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (5): 1236-1246. DOI: 10.11772/j.issn.1001-9081.2020081152

Special topics: Artificial Intelligence; Review

• Artificial Intelligence •

Review of pre-trained models for natural language processing tasks

LIU Ruiheng, YE Xia, YUE Zengying

  1. Academy of Combat Support, Rocket Force University of Engineering, Xi'an, Shaanxi 710025, China
  • Received: 2020-08-03 Revised: 2020-11-15 Online: 2020-12-09 Published: 2021-05-10
  • Corresponding author: LIU Ruiheng
  • About the authors: LIU Ruiheng (born 1997), male, from Lantian, Shaanxi, M.S. candidate; research interests: natural language processing, data analysis. YE Xia (born 1977), female, from Luhe, Jiangsu, associate professor, Ph.D.; research interests: databases, computer networks. YUE Zengying (born 1991), male, from Jining, Shandong, M.S. candidate; research interests: natural language processing, data mining.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (62006240).

Abstract: In recent years, deep learning has developed rapidly. In Natural Language Processing (NLP), as text representation techniques have advanced from the word level to the document level, unsupervised pre-training on large-scale corpora has been shown to effectively improve model performance on downstream tasks. First, following the development of text feature extraction techniques, typical models are analyzed at the word level and at the document level. Second, the current state of research on pre-trained models is analyzed across two stages, pre-training objectives and downstream applications, and the characteristics of representative models are summarized. Finally, the main challenges facing the development of pre-trained models are summarized and prospects for future work are presented.

Key words: Natural Language Processing (NLP), pre-trained model, deep learning, unsupervised learning, neural network
