Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (6): 1767-1774. DOI: 10.11772/j.issn.1001-9081.2023050709
• Artificial Intelligence •
Few-shot news topic classification method based on knowledge enhancement and prompt learning
Xinyan YU1, Cheng ZENG1,2,3, Qian WANG1, Peng HE2,3,4, Xiaoyu DING1

Received: 2023-06-06
Revised: 2023-11-15
Accepted: 2023-12-05
Online: 2024-01-04
Published: 2024-06-10
Contact: Cheng ZENG
About author: YU Xinyan, born in 1995, M.S. candidate. Her research interests include natural language processing and few-shot learning.
Abstract: Classification methods based on pre-training and fine-tuning usually require a large amount of labeled data, which makes them inapplicable to few-shot classification tasks. Therefore, for Chinese few-shot news topic classification, a classification method based on Knowledge enhancement and Prompt Learning (KPL) was proposed. First, the optimal prompt template was learned on the training set with a pre-trained language model. Then, the template was combined with the input text, turning the classification task into a cloze-style task; at the same time, external knowledge was used to expand the label word space and enrich the semantic information of the label words. Finally, the predicted label words were mapped back to the original labels. Few-shot training and validation sets were formed by random sampling from three news datasets: THUCNews, SHNews and Toutiao. The experimental results show that the proposed method improves overall performance on the 1-shot, 5-shot, 10-shot and 20-shot tasks on these datasets, with an especially notable gain on the 1-shot task: compared with baseline few-shot classification methods, the accuracy is increased by at least 7.59, 2.11 and 3.10 percentage points respectively, verifying the effectiveness of KPL on few-shot news topic classification.
Cite this article: Xinyan YU, Cheng ZENG, Qian WANG, Peng HE, Xiaoyu DING. Few-shot news topic classification method based on knowledge enhancement and prompt learning [J]. Journal of Computer Applications, 2024, 44(6): 1767-1774.
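The core inference step of KPL can be illustrated as follows: a template turns the news text into a cloze sentence, and each label is scored through its expanded label-word set at the [MASK] position. The sketch below is a minimal illustration with an off-the-shelf Chinese BERT; the model choice, the tiny label-word sets, and the mean-over-characters aggregation are simplifying assumptions for illustration, not the paper's exact configuration.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForMaskedLM.from_pretrained("bert-base-chinese").eval()

# Knowledge-expanded label-word sets (cf. Tab. 1); tiny illustrative subset.
label_words = {
    "房地产": ["房地产", "房产", "房地产业"],
    "教育": ["高考", "考生", "高中"],
}

def classify(text: str) -> str:
    # Wrap the input with a cloze template from Tab. 3.
    prompt = f"这是一条{tokenizer.mask_token}新闻:{text}"
    enc = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=256)
    pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**enc).logits[0, pos]                # vocabulary logits at [MASK]
    scores = {}
    for label, words in label_words.items():
        per_word = []
        for w in words:
            ids = tokenizer.convert_tokens_to_ids(list(w))  # char-level tokens for Chinese BERT
            per_word.append(logits[ids].mean())             # mean over characters (simplification)
        scores[label] = torch.stack(per_word).mean()        # aggregate the whole word set
    return max(scores, key=scores.get)                      # map best-scoring set back to its label

print(classify("北京二手房成交量环比上涨"))                  # likely 房地产
```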
Tab. 1 Examples of label word sets for different datasets

| Dataset | Label | Label word set |
|---|---|---|
| THUCNews | 房地产 | 房地产, 房产, 房地产业 |
| THUCNews | 金融 | 金银, 金融业, 金融市场 |
| THUCNews | 教育 | 高考, 考生, 高中, 文综 |
| Toutiao | 电竞 | 网络游戏, 竞技, 玩家 |
| Toutiao | 农业 | 第一产业, 农林牧副渔, 农林 |
| Toutiao | 证券 | 出游, 旅行, 出行 |
| SHNews | 科技 | 高科技, 高新技术, 技术 |
| SHNews | 文化 | 文明, 人文, 文风 |
| SHNews | 旅游 | 出游, 旅行, 出行 |
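The word sets in Tab. 1 come from expanding each label with related words drawn from external knowledge. A minimal sketch of such an expansion follows; the `related_words` lookup is a hypothetical stand-in for the external knowledge source, and the vocabulary filter (dropping words containing characters unknown to the PLM) is an illustrative assumption.

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")

# Hypothetical external knowledge lookup; stands in for whatever
# knowledge source supplies related words for each label.
related_words = {
    "旅游": ["出游", "旅行", "出行"],
    "科技": ["高科技", "高新技术", "技术"],
}

def expand_label(label: str) -> list[str]:
    candidates = [label] + related_words.get(label, [])
    # Keep only words whose characters are all in the PLM vocabulary,
    # so the verbalizer can score every expanded word.
    kept = []
    for w in candidates:
        ids = tokenizer.convert_tokens_to_ids(list(w))
        if tokenizer.unk_token_id not in ids:
            kept.append(w)
    return list(dict.fromkeys(kept))  # dedupe, keep order

print(expand_label("旅游"))  # ['旅游', '出游', '旅行', '出行']
```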
Tab. 2 Statistics of experiment datasets

| Dataset | Training samples | Validation samples | Test samples | Label classes |
|---|---|---|---|---|
| THUCNews | 180 000 | 10 000 | 10 000 | 10 |
| Toutiao | 267 877 | 57 401 | 57 401 | 15 |
| SHNews | 22 699 | 5 764 | 5 755 | 12 |
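The few-shot splits are formed by randomly sampling k examples per class from each full training set in Tab. 2 (k = 1, 5, 10, 20). A minimal sampling sketch, with the fixed seed as an illustrative choice:

```python
import random
from collections import defaultdict

def sample_k_shot(examples, k, seed=42):
    """examples: list of (text, label) pairs; returns a subset with k examples per label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    subset = []
    for label, items in by_label.items():
        subset.extend(rng.sample(items, k))  # assumes every class has at least k examples
    return subset
```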
Tab. 3 Chinese templates for experiment datasets

| Dataset | Template |
|---|---|
| THUCNews | 这是一条[MASK]新闻:x |
| THUCNews | [MASK]新闻:x |
| THUCNews | x是[MASK]新闻 |
| Toutiao | 这是一条[MASK]新闻:x |
| Toutiao | [MASK]新闻:x |
| Toutiao | 分类:[MASK]x |
| SHNews | 这是一条[MASK]新闻:x |
| SHNews | [MASK]新闻:x |
| SHNews | 主题:[MASK]x |
Tab. 4 Experiment results of 1/5/10/20-shot text classification on different datasets (unit: %)

| k-shot | Model | THUCNews Acc | THUCNews Macro_F1 | Toutiao Acc | Toutiao Macro_F1 | SHNews Acc | SHNews Macro_F1 |
|---|---|---|---|---|---|---|---|
| 1 | FT | 48.27 | 45.90 | 28.86 | 28.71 | 28.78 | 26.79 |
| 1 | Soft-verb | 67.29 | 66.62 | 63.91 | 58.44 | 55.98 | 54.31 |
| 1 | Auto-verb | 34.38 | 31.98 | 37.11 | 31.27 | 30.36 | 27.50 |
| 1 | PET | 69.53 | 68.92 | 59.84 | 55.07 | 56.28 | 54.48 |
| 1 | Soft-prompt | 65.00 | 64.22 | 61.00 | 55.66 | 47.69 | 46.41 |
| 1 | KPL | 77.12 | 76.94 | 67.01 | 61.68 | 58.39 | 56.46 |
| 5 | FT | 78.72 | 78.67 | 67.94 | 68.08 | 57.72 | 58.28 |
| 5 | Soft-verb | 81.97 | 81.77 | 73.04 | 67.03 | 65.67 | 65.34 |
| 5 | Auto-verb | 76.47 | 75.52 | 68.58 | 62.84 | 58.78 | 58.45 |
| 5 | PET | 82.23 | 82.06 | 72.58 | 66.87 | 65.48 | 65.27 |
| 5 | Soft-prompt | 80.71 | 80.26 | 73.06 | 67.20 | 65.13 | 64.78 |
| 5 | KPL | 82.95 | 82.78 | 74.02 | 68.04 | 66.08 | 65.80 |
| 10 | FT | 81.53 | 81.43 | 72.16 | 72.74 | 64.29 | 64.44 |
| 10 | Soft-verb | 84.35 | 84.30 | 74.97 | 69.09 | 67.51 | 67.37 |
| 10 | Auto-verb | 80.72 | 79.72 | 73.84 | 67.68 | 64.98 | 64.46 |
| 10 | PET | 84.16 | 84.07 | 74.58 | 68.95 | 67.85 | 67.66 |
| 10 | Soft-prompt | 84.95 | 84.92 | 75.00 | 68.87 | 66.36 | 65.88 |
| 10 | KPL | 85.50 | 85.38 | 75.21 | 69.38 | 68.60 | 68.54 |
| 20 | FT | 83.88 | 83.89 | 74.88 | 75.15 | 66.36 | 66.53 |
| 20 | Soft-verb | 86.19 | 86.12 | 76.55 | 70.57 | 69.30 | 69.26 |
| 20 | Auto-verb | 81.78 | 79.95 | 76.50 | 70.19 | 68.15 | 67.81 |
| 20 | PET | 86.67 | 86.60 | 75.70 | 69.72 | 68.82 | 68.91 |
| 20 | Soft-prompt | 86.48 | 86.45 | 76.73 | 71.09 | 69.59 | 69.47 |
| 20 | KPL | 87.04 | 86.96 | 77.21 | 71.24 | 70.31 | 70.16 |
Tab. 5 Ablation experiment results on THUCNews (unit: %)

| k-shot | P-tuning | Know | Acc | Macro_F1 |
|---|---|---|---|---|
| 1 | × | × | 72.19 | 71.84 |
| 1 | √ | × | 73.56 | 73.07 |
| 1 | × | √ | 76.91 | 76.75 |
| 1 | √ | √ | 77.12 | 76.94 |
| 5 | × | × | 81.29 | 81.01 |
| 5 | √ | × | 82.36 | 82.33 |
| 5 | × | √ | 82.33 | 82.14 |
| 5 | √ | √ | 82.95 | 82.78 |
| 10 | × | × | 84.58 | 84.56 |
| 10 | √ | × | 85.61 | 85.53 |
| 10 | × | √ | 85.56 | 85.50 |
| 10 | √ | √ | 85.50 | 85.38 |
| 20 | × | × | 86.63 | 86.61 |
| 20 | √ | × | 86.70 | 86.65 |
| 20 | × | √ | 86.99 | 86.94 |
| 20 | √ | √ | 87.04 | 86.96 |
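The "P-tuning" component in Tab. 5 learns the prompt template as trainable continuous vectors while the pre-trained model stays frozen. The sketch below is a minimal illustration of this idea; inserting the vectors after [CLS], the prompt length of 4, and the single training step are illustrative assumptions rather than the paper's exact setup.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForMaskedLM.from_pretrained("bert-base-chinese")
for p in model.parameters():
    p.requires_grad = False                      # freeze the PLM; only the prompt vectors train

emb = model.get_input_embeddings()
n_prompt = 4                                     # illustrative prompt length
prompt_vecs = torch.nn.Parameter(torch.randn(n_prompt, emb.embedding_dim) * 0.02)

def mask_logits(text: str) -> torch.Tensor:
    """Vocabulary logits at the [MASK] position, with the soft prompt inserted after [CLS]."""
    enc = tokenizer(f"{tokenizer.mask_token}新闻:{text}", return_tensors="pt")
    tok = emb(enc["input_ids"])                  # [1, L, H] frozen token embeddings
    x = torch.cat([tok[:, :1], prompt_vecs.unsqueeze(0), tok[:, 1:]], dim=1)
    attn = torch.ones(x.shape[:2], dtype=torch.long)
    out = model(inputs_embeds=x, attention_mask=attn)
    pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero().item() + n_prompt
    return out.logits[0, pos]

# One illustrative training step: push mass toward the first character of label word "教育".
opt = torch.optim.Adam([prompt_vecs], lr=1e-3)
target = tokenizer.convert_tokens_to_ids("教")
loss = torch.nn.functional.cross_entropy(
    mask_logits("高考成绩今日公布").unsqueeze(0), torch.tensor([target]))
loss.backward()
opt.step()
```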
References

[1] DEVLIN J, CHANG M-W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding [EB/OL]. (2019-05-24) [2023-05-13].
[2] LIU Y, OTT M, GOYAL N, et al. RoBERTa: a robustly optimized BERT pretraining approach [EB/OL]. (2020-07-26) [2023-05-13].
[3] RAFFEL C, SHAZEER N, ROBERTS A, et al. Exploring the limits of transfer learning with a unified text-to-text transformer [J]. The Journal of Machine Learning Research, 2020, 21(1): 5485-5551.
[4] BROWN T B, MANN B, RYDER N, et al. Language models are few-shot learners [C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 1877-1901.
[5] SUN C, QIU X, XU Y, et al. How to fine-tune BERT for text classification? [C]// Proceedings of the 18th China National Conference on Chinese Computational Linguistics. Berlin: Springer, 2019: 194-206.
[6] WANG Q, ZENG C, HE P, et al. News topic text classification based on RoBERTa-RCNN and attention pooling [J/OL]. Journal of Zhengzhou University (Natural Science Edition): 1-8 [2023-05-13].
[7] OCH F J, GILDEA D, KHUDANPUR S, et al. A smorgasbord of features for statistical machine translation [C]// Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004. Stroudsburg: ACL, 2004: 161-168.
[8] ZHANG Y, NIVRE J. Transition-based dependency parsing with rich non-local features [C]// Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2011: 188-193.
[9] LIU P, YUAN W, FU J, et al. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing [J]. ACM Computing Surveys, 2023, 55(9): 195.
[10] SCHICK T, SCHÜTZE H. Exploiting cloze-questions for few-shot text classification and natural language inference [C]// Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Stroudsburg: ACL, 2021: 255-269.
[11] SCAO T L, RUSH A. How many data points is a prompt worth? [C]// Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2021: 2627-2636.
[12] GAO T, FISCH A, CHEN D. Making pre-trained language models better few-shot learners [C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg: ACL, 2021: 3816-3830.
[13] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010.
[14] RONG X. word2vec parameter learning explained [EB/OL]. (2014-11-11) [2023-05-13].
[15] PENNINGTON J, SOCHER R, MANNING C. GloVe: global vectors for word representation [C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2014: 1532-1543.
[16] JIANG Z, XU F F, ARAKI J, et al. How can we know what language models know? [J]. Transactions of the Association for Computational Linguistics, 2020, 8: 423-438.
[17] SHIN T, RAZEGHI Y, LOGAN IV R L, et al. AutoPrompt: eliciting knowledge from language models with automatically generated prompts [C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2020: 4222-4235.
[18] LIU X, ZHENG Y, DU Z, et al. GPT understands, too [EB/OL]. (2021-05-18) [2023-05-13].
[19] LI X L, LIANG P. Prefix-tuning: optimizing continuous prompts for generation [C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg: ACL, 2021: 4582-4597.
[20] HAMBARDZUMYAN K, KHACHATRIAN H, MAY J. WARP: word-level adversarial reprogramming [C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg: ACL, 2021: 4921-4933.
[21] SCHICK T, SCHMID H, SCHÜTZE H. Automatically identifying words that can serve as labels for few-shot text classification [C]// Proceedings of the 28th International Conference on Computational Linguistics. [S.l.]: International Committee on Computational Linguistics, 2020: 5569-5578.
[22] WEI J, HUANG C, VOSOUGHI S, et al. Few-shot text classification with triplet networks, data augmentation, and curriculum learning [C]// Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2021: 5493-5500.
[23] MIYATO T, DAI A M, GOODFELLOW I. Adversarial training methods for semi-supervised text classification [EB/OL]. (2016-05-25) [2023-05-13].
[24] CHEN J, YANG Z, YANG D. MixText: linguistically-informed interpolation of hidden space for semi-supervised text classification [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 2147-2157.
[25] SUN Z, FAN C, SUN X, et al. Neural semi-supervised learning for text classification under large-scale pretraining [EB/OL]. (2020-11-19) [2023-05-13].
[26] XIONG W, GONG Y. Text classification based on meta learning for unbalanced small samples [J]. Journal of Chinese Information Processing, 2022, 36(1): 104-116.
[27] YAO H, WU Y-X, AL-SHEDIVAT M, et al. Knowledge-aware meta-learning for low-resource text classification [C]// Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2021: 1814-1821.
[28] SCHICK T, SCHÜTZE H. It's not just size that matters: small language models are also few-shot learners [C]// Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2021: 2339-2352.
[29] YU B H, CAI X Y, WEI J X. Few-shot text classification method based on prompt learning [J]. Journal of Computer Applications, 2023, 43(9): 2735-2740.
[30] HU S, DING N, WANG H, et al. Knowledgeable prompt-tuning: incorporating knowledge into prompt verbalizer for text classification [C]// Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2022: 2225-2240.
[31] MENG Y, ZHANG Y, HUANG J, et al. Text classification using label names only: a language model self-training approach [C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2020: 9006-9017.
[32] PEREZ E, KIELA D, CHO K. True few-shot learning with language models [J]. Advances in Neural Information Processing Systems, 2021, 34: 11054-11070.
[33] CUI Y, CHE W, LIU T, et al. Pre-training with whole word masking for Chinese BERT [J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, 29: 3504-3514.
[34] DING N, HU S, ZHAO W, et al. OpenPrompt: an open-source framework for prompt-learning [C]// Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. Stroudsburg: ACL, 2022: 105-113.
[35] LESTER B, AL-RFOU R, CONSTANT N. The power of scale for parameter-efficient prompt tuning [C]// Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2021: 3045-3059.