Journal of Computer Applications
Special Issue: Artificial Intelligence
Received: 2023-06-05
Revised: 2023-11-15
Accepted: 2023-12-05
Online: 2026-02-05
Published: 2024-06-10
YU Xinyan, ZENG Cheng, WANG Qian, HE Peng, DING Xiaoyu
Corresponding author: ZENG Cheng
YU Xinyan, ZENG Cheng, WANG Qian, HE Peng, DING Xiaoyu. Few-shot news topic classification method based on knowledge enhancement and prompt learning [J]. Journal of Computer Applications. DOI: 10.11772/j.issn.1001-9081.2023050709.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023050709
| Dataset | Label | Label word set |
|---|---|---|
| THUCNews | 房地产 | 房地产,房产,房地产业 |
| THUCNews | 金融 | 金银,金融业,金融市场 |
| THUCNews | 教育 | 高考,考生,高中,文综 |
| Toutiao | 电竞 | 网络游戏,竞技,玩家 |
| Toutiao | 农业 | 第一产业,农林牧副渔,农林 |
| Toutiao | 证券 | 出游,旅行,出行 |
| SHNews | 科技 | 高科技,高新技术,技术 |
| SHNews | 文化 | 文明,人文,文风 |
| SHNews | 旅游 | 出游,旅行,出行 |
Tab.1 Examples of label word sets for different datasets
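The label word sets in Tab.1 act as a knowledge-enhanced verbalizer: the masked language model's probabilities over every word in a label's set are aggregated at the [MASK] position, and the label with the highest aggregate score wins. A minimal sketch of that aggregation step, assuming THUCNews label words from Tab.1 (the `mask_probs` dictionary stands in for real MLM output and is purely illustrative):

```python
# Knowledge-enhanced verbalizer: each label maps to an expanded word set
# (THUCNews examples from Tab.1). A prediction is scored by averaging the
# MLM's [MASK]-position probabilities over each label's word set.
verbalizer = {
    "房地产": ["房地产", "房产", "房地产业"],
    "金融": ["金银", "金融业", "金融市场"],
    "教育": ["高考", "考生", "高中", "文综"],
}

def predict_label(mask_probs):
    """mask_probs: probability the MLM assigns each candidate word at [MASK]."""
    scores = {
        label: sum(mask_probs.get(w, 0.0) for w in words) / len(words)
        for label, words in verbalizer.items()
    }
    return max(scores, key=scores.get)

# Illustrative (made-up) MLM output for a finance headline:
mock_probs = {"金融市场": 0.30, "金融业": 0.20, "房产": 0.05, "高考": 0.01}
print(predict_label(mock_probs))  # → 金融
```

Averaging rather than summing keeps label sets of different sizes (3 vs. 4 words above) comparable; the paper's exact aggregation may differ.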
| Dataset | Training samples | Validation samples | Test samples | Label classes |
|---|---|---|---|---|
| THUCNews | 180 000 | 10 000 | 10 000 | 10 |
| Toutiao | 267 877 | 57 401 | 57 401 | 15 |
| SHNews | 22 699 | 5 764 | 5 755 | 12 |
Tab.2 Statistics of experiment datasets
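The 1/5/10/20-shot settings reported in Tab.4 draw k training examples per label class from the full training splits in Tab.2. A hedged sketch of such a per-class sampler (the `(text, label)` pair format and seed value are assumptions, not the paper's protocol):

```python
import random
from collections import defaultdict

def sample_k_shot(examples, k, seed=42):
    """Draw k examples per label from a list of (text, label) pairs."""
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    rng = random.Random(seed)  # fixed seed for a reproducible few-shot split
    subset = []
    for label, items in sorted(by_label.items()):
        subset.extend(rng.sample(items, min(k, len(items))))
    return subset

# Toy corpus with two classes; a 1-shot draw keeps one example per class.
corpus = [("股市大涨", "金融"), ("央行降息", "金融"), ("高考开考", "教育")]
print(len(sample_k_shot(corpus, k=1)))  # → 2
```

Sorting the labels before sampling makes the subset deterministic for a given seed, which matters when comparing models under identical few-shot splits.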
| Dataset | Template |
|---|---|
| THUCNews | 这是一条[MASK]新闻:x |
| THUCNews | [MASK]新闻:x |
| THUCNews | x是[MASK]新闻 |
| Toutiao | 这是一条[MASK]新闻:x |
| Toutiao | [MASK]新闻:x |
| Toutiao | 分类:[MASK]x |
| SHNews | 这是一条[MASK]新闻:x |
| SHNews | [MASK]新闻:x |
| SHNews | 主题:[MASK]x |
Tab.3 Chinese templates for experiment datasets
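Each template in Tab.3 wraps a news text x into a cloze sentence whose [MASK] slot the model fills with a label word. A minimal string-level illustration of the wrapping (tokenization and the soft-prompt tokens used by P-tuning are omitted; `{x}` marks the insertion point):

```python
# Templates from Tab.3; "{x}" marks where the news text x is inserted.
TEMPLATES = {
    "THUCNews": ["这是一条[MASK]新闻:{x}", "[MASK]新闻:{x}", "{x}是[MASK]新闻"],
    "Toutiao": ["这是一条[MASK]新闻:{x}", "[MASK]新闻:{x}", "分类:[MASK]{x}"],
    "SHNews": ["这是一条[MASK]新闻:{x}", "[MASK]新闻:{x}", "主题:[MASK]{x}"],
}

def wrap(dataset, text, template_id=0):
    """Render one cloze prompt for a given dataset and news text."""
    return TEMPLATES[dataset][template_id].format(x=text)

print(wrap("THUCNews", "央行宣布下调存款准备金率"))
# → 这是一条[MASK]新闻:央行宣布下调存款准备金率
```

Square brackets are inert in `str.format`, so `[MASK]` survives formatting verbatim; only the `{x}` placeholder is substituted.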
| k-shot | Model | THUCNews Acc | THUCNews Macro_F1 | Toutiao Acc | Toutiao Macro_F1 | SHNews Acc | SHNews Macro_F1 |
|---|---|---|---|---|---|---|---|
| 1 | FT | 48.27 | 45.90 | 28.86 | 28.71 | 28.78 | 26.79 |
| 1 | Soft-verb | 67.29 | 66.62 | 63.91 | 58.44 | 55.98 | 54.31 |
| 1 | Auto-verb | 34.38 | 31.98 | 37.11 | 31.27 | 30.36 | 27.50 |
| 1 | PET | 69.53 | 68.92 | 59.84 | 55.07 | 56.28 | 54.48 |
| 1 | Soft-prompt | 65.00 | 64.22 | 61.00 | 55.66 | 47.69 | 46.41 |
| 1 | KPL | 77.12 | 76.94 | 67.01 | 61.68 | 58.39 | 56.46 |
| 5 | FT | 78.72 | 78.67 | 67.94 | 68.08 | 57.72 | 58.28 |
| 5 | Soft-verb | 81.97 | 81.77 | 73.04 | 67.03 | 65.67 | 65.34 |
| 5 | Auto-verb | 76.47 | 75.52 | 68.58 | 62.84 | 58.78 | 58.45 |
| 5 | PET | 82.23 | 82.06 | 72.58 | 66.87 | 65.48 | 65.27 |
| 5 | Soft-prompt | 80.71 | 80.26 | 73.06 | 67.20 | 65.13 | 64.78 |
| 5 | KPL | 82.95 | 82.78 | 74.02 | 68.04 | 66.08 | 65.80 |
| 10 | FT | 81.53 | 81.43 | 72.16 | 72.74 | 64.29 | 64.44 |
| 10 | Soft-verb | 84.35 | 84.30 | 74.97 | 69.09 | 67.51 | 67.37 |
| 10 | Auto-verb | 80.72 | 79.72 | 73.84 | 67.68 | 64.98 | 64.46 |
| 10 | PET | 84.16 | 84.07 | 74.58 | 68.95 | 67.85 | 67.66 |
| 10 | Soft-prompt | 84.95 | 84.92 | 75.00 | 68.87 | 66.36 | 65.88 |
| 10 | KPL | 85.50 | 85.38 | 75.21 | 69.38 | 68.60 | 68.54 |
| 20 | FT | 83.88 | 83.89 | 74.88 | 75.15 | 66.36 | 66.53 |
| 20 | Soft-verb | 86.19 | 86.12 | 76.55 | 70.57 | 69.30 | 69.26 |
| 20 | Auto-verb | 81.78 | 79.95 | 76.50 | 70.19 | 68.15 | 67.81 |
| 20 | PET | 86.67 | 86.60 | 75.70 | 69.72 | 68.82 | 68.91 |
| 20 | Soft-prompt | 86.48 | 86.45 | 76.73 | 71.09 | 69.59 | 69.47 |
| 20 | KPL | 87.04 | 86.96 | 77.21 | 71.24 | 70.31 | 70.16 |
Tab.4 Experiment results of 1/5/10/20-shot text classification on different datasets
| k-shot | P-tuning | Know | Acc | Macro_F1 |
|---|---|---|---|---|
| 1 | × | × | 72.19 | 71.84 |
| 1 | √ | × | 73.56 | 73.07 |
| 1 | × | √ | 76.91 | 76.75 |
| 1 | √ | √ | 77.12 | 76.94 |
| 5 | × | × | 81.29 | 81.01 |
| 5 | √ | × | 82.36 | 82.33 |
| 5 | × | √ | 82.33 | 82.14 |
| 5 | √ | √ | 82.95 | 82.78 |
| 10 | × | × | 84.58 | 84.56 |
| 10 | √ | × | 85.61 | 85.53 |
| 10 | × | √ | 85.56 | 85.50 |
| 10 | √ | √ | 85.50 | 85.38 |
| 20 | × | × | 86.63 | 86.61 |
| 20 | √ | × | 86.70 | 86.65 |
| 20 | × | √ | 86.99 | 86.94 |
| 20 | √ | √ | 87.04 | 86.96 |
Tab.5 Ablation experiment results on THUCNews