Journal of Computer Applications (《计算机应用》): sole official website

Topic: Artificial Intelligence


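Few-shot News Topic Classification Method Based on Knowledge Enhancement and Prompt Learning

余新言, 曾诚, 王乾, 何鹏, 丁晓玉

  1. Hubei University
  • Received: 2023-06-05 Revised: 2023-11-15 Accepted: 2023-12-05 Online: 2026-02-05 Published: 2024-06-10
  • Corresponding author: 曾诚
  • Supported by:
    Research on knowledge-graph-guided, content-controllable generative text steganography; Research on key technologies for intelligent production based on industrial big data; Research on intelligent analysis and visualization of social collaboration networks based on graph computing; Development and application demonstration of a risk identification and prediction system for urban public transport drivers and crew; Research and application of key technologies for blockchain security evaluation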

Abstract: Classification methods based on fine-tuning pre-trained models usually require large amounts of annotated data, which makes them unsuitable for few-shot classification tasks. Therefore, a method based on Knowledge enhancement and Prompt Learning (KPL) was proposed for Chinese few-shot news topic classification. Firstly, an optimal prompt template was learned from the training set by using a pre-trained model. Secondly, the prompt template was combined with the input text, transforming the classification task into a cloze task; at the same time, external knowledge was used to expand the label word space and enrich the semantic information of the label words. Finally, the predicted label words were mapped back to the original labels. Experiments were conducted on three news datasets, THUCNews, SHNews, and Toutiao, each formed by sampling. The results show that the proposed method improves overall performance on the 1-shot, 5-shot, 10-shot, and 20-shot tasks on these datasets, with the most notable gains on the 1-shot task: compared with baseline few-shot classification methods, accuracy increased by 7.95, 2.11, and 3.1 percentage points, respectively, verifying the effectiveness of knowledge enhancement and prompt learning for few-shot news topic classification.

Key words: news topic classification, prompt learning, knowledge enhancement, few-shot learning, text classification
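
The pipeline summarized in the abstract (wrap the input in a prompt template, score label words at the masked position, then map label words back to classes) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the template string, the expanded label-word sets, the averaging rule, and the toy probabilities standing in for a pre-trained masked language model's output. In KPL the template is learned from the training set rather than hand-written.

```python
from typing import Dict, List

# Hypothetical hand-written template (KPL learns its template from training data).
TEMPLATE = "This is a piece of [MASK] news: {text}"

# Hypothetical label-word sets: each class label expanded with related
# words, standing in for the external-knowledge expansion step.
LABEL_WORDS: Dict[str, List[str]] = {
    "sports": ["sports", "football", "basketball"],
    "finance": ["finance", "stock", "bank"],
}


def build_cloze(text: str) -> str:
    """Combine the template with the input text to form a cloze query."""
    return TEMPLATE.format(text=text)


def classify(mask_probs: Dict[str, float]) -> str:
    """Map label-word probabilities at [MASK] back to a class label.

    Each class is scored by the mean probability of its label words
    (an assumed aggregation rule); words missing from mask_probs count as 0.
    """
    scores = {
        label: sum(mask_probs.get(word, 0.0) for word in words) / len(words)
        for label, words in LABEL_WORDS.items()
    }
    return max(scores, key=scores.get)


# Toy probabilities in place of a real masked language model's predictions.
query = build_cloze("The team won the final.")
predicted = classify({"football": 0.4, "sports": 0.2, "stock": 0.1})
```

With these toy values the mean score is 0.2 for `sports` and about 0.033 for `finance`, so `predicted` is `"sports"`. The sketch shows why a larger label-word set can help when labeled examples are scarce: the class is recognized even though the single most probable word, `football`, is not the class name itself.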

CLC Number: