Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (9): 2735-2740. DOI: 10.11772/j.issn.1001-9081.2022081295

• Artificial Intelligence •

Few-shot text classification method based on prompt learning

Bihui YU1,2, Xingye CAI1,2, Jingxuan WEI1,2

  1. University of Chinese Academy of Sciences, Beijing 100049, China
    2. Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang Liaoning 110168, China
  • Received:2022-09-05 Revised:2022-12-09 Accepted:2023-01-03 Online:2023-02-28 Published:2023-09-10
  • Contact: Xingye CAI
  • About author:YU Bihui, born in 1982, Ph. D., research fellow. His research interests include knowledge engineering, big data, semantic Web.
    WEI Jingxuan, born in 1998, M. S. candidate. His research interests include multimodal classification and generation.
  • Supported by:
    the National Key Research and Development Program of China (2019YFB1405803)

Abstract:

Text classification tasks usually rely on sufficient labeled data. To address the over-fitting problem of classification models trained on small samples in low-resource scenarios, a few-shot text classification method based on prompt learning, called BERT-P-Tuning, was proposed. Firstly, the pre-trained model BERT (Bidirectional Encoder Representations from Transformers) was used to learn the optimal prompt template from the labeled samples. Then, the prompt template and a blank (mask) position were appended to each sample, transforming the text classification task into a cloze task. Finally, the final label was obtained by predicting the word with the highest probability at the blank position and applying the mapping between that word and the labels. Experimental results on the short text classification tasks of the public dataset FewCLUE show that the proposed method achieves significant improvements in evaluation metrics over the BERT fine-tuning based method. Specifically, the proposed method improves the accuracy and F1 score by 25.2 and 26.7 percentage points respectively on the binary classification task, and by 6.6 and 8.0 percentage points respectively on the multi-class classification task. Compared with the PET (Pattern Exploiting Training) method, which constructs templates manually, the proposed method improves the accuracy by 2.9 and 2.8 percentage points on the two tasks respectively, and the F1 score by 4.4 and 4.2 percentage points respectively, verifying the effectiveness of applying pre-trained models to few-shot tasks.
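
To make the workflow above concrete, the following is a minimal sketch in Python of prompt-based cloze classification with a masked language model, using the Hugging Face transformers library. It is not the authors' implementation: the model name, template wording, and label-word mapping (verbalizer) are illustrative assumptions, and P-Tuning proper would learn continuous prompt embeddings rather than the fixed, hand-written template used here.

    # Minimal sketch: prompt-based cloze classification with a masked language model.
    # The model name, template, and label words below are illustrative assumptions.
    import torch
    from transformers import BertForMaskedLM, BertTokenizerFast

    MODEL_NAME = "bert-base-chinese"                    # assumed backbone
    tokenizer = BertTokenizerFast.from_pretrained(MODEL_NAME)
    model = BertForMaskedLM.from_pretrained(MODEL_NAME).eval()

    # Hand-written template and verbalizer (label word -> class name).
    TEMPLATE = "这条评论的情感是{mask}的。"
    VERBALIZER = {"好": "positive", "差": "negative"}

    def classify(text: str) -> str:
        # Append the template with a [MASK] slot, turning classification into a cloze task.
        prompt = text + TEMPLATE.format(mask=tokenizer.mask_token)
        inputs = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits             # (1, seq_len, vocab_size)
        # Read the logits at the [MASK] position.
        mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0].item()
        mask_logits = logits[0, mask_pos]
        # Pick the label word with the highest score and map it back to a class.
        word_ids = tokenizer.convert_tokens_to_ids(list(VERBALIZER))
        scores = {cls: mask_logits[i].item() for cls, i in zip(VERBALIZER.values(), word_ids)}
        return max(scores, key=scores.get)

    print(classify("这家餐厅的菜很新鲜,服务也很周到。"))   # expected output: positive

In the full method, the fixed template tokens would be replaced by trainable prompt embeddings optimized on the labeled samples, while the masked-word prediction and word-to-label mapping steps remain as sketched.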

Key words: few-shot learning, text classification, pre-trained model, prompt learning, adaptive template

CLC Number: