Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (3): 849-855. DOI: 10.11772/j.issn.1001-9081.2024091325
Chinese spelling correction method based on LLM with multiple inputs

MA Can 1,2, HUANG Ruizhang 1,2, REN Lina 1,2,3, BAI Ruina 1,2, WU Yaoyao 1,2

Received: 2024-09-20
Revised: 2024-12-11
Accepted: 2024-12-13
Online: 2025-02-13
Published: 2025-03-10
Corresponding author: HUANG Ruizhang
About the author: MA Can, born in 1992 in Ezhou, Hubei, M.S. His research interests include text correction and text mining.
Abstract: Chinese spelling correction (CSC) is an important research task in natural language processing (NLP). Owing to the generation mechanism of large language models (LLMs), existing LLM-based CSC methods can produce corrections that deviate semantically from the original text. To address this, a multi-input CSC method based on an LLM was proposed. The method comprises two stages: multi-input candidate set construction and LLM-based correction. In the first stage, the corrections produced by several small models are assembled into a multi-input candidate set. In the second stage, the LLM is fine-tuned with LoRA (Low-Rank Adaptation) so that, drawing on its reasoning ability, it predicts the sentence without spelling errors from the candidate set as the final correction. Experimental results on the public datasets SIGHAN13, SIGHAN14, SIGHAN15, and the revised SIGHAN15 show that, compared with Prompt-GEN-1, which uses the LLM to generate corrections directly, the proposed method improves the correction F1 score by 9.6, 24.9, 27.9, and 34.2 percentage points, respectively; compared with the second-best small correction model, it improves the correction F1 score by 1.0, 1.1, 0.4, and 2.4 percentage points, respectively, confirming that the proposed method improves CSC performance.
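The two-stage pipeline described in the abstract can be sketched in a few lines of Python. The small-model correctors (`fix_a`, `fix_b`) and `toy_llm` below are hypothetical stand-ins for illustration only; what follows the method is the control flow — build a de-duplicated candidate set from several correctors, then ask the (LoRA fine-tuned) model to pick the error-free sentence with a Prompt-SEL-style instruction (Tab. 1).

```python
def build_candidate_set(sentence, correctors):
    """Stage 1: collect corrections from several small CSC models,
    de-duplicated while preserving order."""
    seen, candidates = set(), []
    for correct in correctors:
        cand = correct(sentence)
        if cand not in seen:
            seen.add(cand)
            candidates.append(cand)
    return candidates

def select_with_llm(candidates, llm):
    """Stage 2: ask the LLM which candidate is error-free,
    using the Prompt-SEL template wording from Tab. 1."""
    prompt = ("下列那个句子没有拼写错误,输出对应句子。"
              "若所有候选句子均有错误,则输出无拼写正确结果。\n")
    prompt += "\n".join(f"候选{i}: {c}" for i, c in enumerate(candidates, 1))
    return llm(prompt)

# Toy stand-ins for two small correctors and the LLM (hypothetical).
fix_a = lambda s: s.replace("师公", "市公")
fix_b = lambda s: s.replace("师公", "施工")

def toy_llm(prompt):
    # Stand-in "LLM": picks the candidate line containing 施工.
    for line in prompt.splitlines():
        if line.startswith("候选") and "施工" in line:
            return line.split(": ", 1)[1]
    return "无拼写正确结果"

sentence = "要求师公单位对项目进行垫资。"
cands = build_candidate_set(sentence, [fix_a, fix_b])
print(select_with_llm(cands, toy_llm))  # 要求施工单位对项目进行垫资。
```

In the paper the selector is an actual fine-tuned LLM; the fallback string 无拼写正确结果 ("no correctly spelled result") mirrors the Prompt-SEL template and the failure case in Tab. 8.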
Can MA, Ruizhang HUANG, Lina REN, Ruina BAI, Yaoyao WU. Chinese spelling correction method based on LLM with multiple inputs[J]. Journal of Computer Applications, 2025, 45(3): 849-855.
Tab. 1 Prompt templates

| Prompt name | Prompt template |
|---|---|
| Prompt-GEN-1 | 请对以下句子进行拼写纠错: (Please correct the spelling errors in the following sentence:) |
| Prompt-GEN-2 | 对于待纠错句子: … 参考候选纠错结果,请对待纠错句子进行拼写纠错,输出纠错后的句子。 (For the sentence to be corrected: … referring to the candidate correction results, correct its spelling errors and output the corrected sentence.) |
| Prompt-SEL | 下列那个句子没有拼写错误,输出对应句子。若所有候选句子均有错误,则输出无拼写正确结果。 … (Which of the following sentences contains no spelling errors? Output that sentence; if all candidates contain errors, output "no correctly spelled result".) |
Tab. 2 Statistics of training and test datasets

| Split | Dataset | #Sentences | Avg. sentence length | #Errors |
|---|---|---|---|---|
| Training | SIGHAN13 | 700 | 41.8 | 343 |
| Training | SIGHAN14 | 3 437 | 49.6 | 5 122 |
| Training | SIGHAN15 | 2 338 | 31.3 | 3 037 |
| Training | Wang271K | 271 329 | 42.6 | 381 962 |
| Test | SIGHAN13 | 1 000 | 74.3 | 1 224 |
| Test | SIGHAN14 | 1 062 | 50.0 | 771 |
| Test | SIGHAN15 | 1 100 | 30.6 | 703 |
| Test | SIGHAN15-REVISED | 1 100 | 30.6 | 858 |
Tab. 3 Spelling error correction results on SIGHAN datasets (Det = error detection, Cor = error correction; Acc/P/R/F1 in %)

| Dataset | Method | Det-Acc | Det-P | Det-R | Det-F1 | Cor-Acc | Cor-P | Cor-R | Cor-F1 |
|---|---|---|---|---|---|---|---|---|---|
| SIGHAN13 | DCN | 55.1 | 57.0 | 55.3 | 56.1 | 54.0 | 55.8 | 54.2 | 55.0 |
| SIGHAN13 | LEAD | 71.2 | 74.6 | 71.4 | 72.9 | 70.2 | 73.5 | 70.3 | 71.9 |
| SIGHAN13 | REALISE | 69.0 | 73.2 | 69.1 | 71.1 | 67.6 | 71.6 | 67.7 | 69.6 |
| SIGHAN13 | Prompt-GEN-1 | 63.8 | 71.1 | 63.1 | 66.9 | 60.5 | 67.3 | 59.7 | 63.3 |
| SIGHAN13 | Prompt-GEN-2 | 66.7 | 67.7 | 66.5 | 67.1 | 64.9 | 65.8 | 64.7 | 65.2 |
| SIGHAN13 | Prompt-SEL | 72.3 | 75.0 | 72.4 | 73.7 | 71.5 | 74.2 | 71.6 | 72.9 |
| SIGHAN14 | DCN | 69.6 | 54.3 | 61.0 | 57.4 | 69.2 | 53.6 | 60.2 | 56.7 |
| SIGHAN14 | LEAD | 75.9 | 63.9 | 67.5 | 65.7 | 75.0 | 62.3 | 65.8 | 64.0 |
| SIGHAN14 | REALISE | 76.2 | 64.8 | 70.2 | 67.4 | 75.0 | 62.7 | 67.9 | 65.2 |
| SIGHAN14 | Prompt-GEN-1 | 54.7 | 39.1 | 55.1 | 45.8 | 52.2 | 35.4 | 49.9 | 41.4 |
| SIGHAN14 | Prompt-GEN-2 | 58.7 | 43.6 | 62.7 | 51.4 | 57.2 | 41.4 | 59.6 | 48.9 |
| SIGHAN14 | Prompt-SEL | 76.4 | 64.7 | 72.3 | 68.3 | 75.3 | 62.8 | 70.2 | 66.3 |
| SIGHAN15 | DCN | 80.5 | 70.2 | 77.7 | 73.8 | 79.0 | 67.3 | 74.6 | 70.8 |
| SIGHAN15 | LEAD | 85.1 | 77.1 | 82.4 | 79.6 | 84.3 | 75.5 | 80.7 | 78.0 |
| SIGHAN15 | REALISE | 83.7 | 76.2 | 80.9 | 78.5 | 83.1 | 75.0 | 79.6 | 77.2 |
| SIGHAN15 | Prompt-GEN-1 | 67.2 | 52.7 | 64.2 | 57.9 | 63.2 | 46.0 | 56.0 | 50.5 |
| SIGHAN15 | Prompt-GEN-2 | 71.9 | 59.5 | 80.3 | 68.4 | 70.5 | 57.3 | 77.4 | 65.8 |
| SIGHAN15 | Prompt-SEL | 85.6 | 78.2 | 84.0 | 81.0 | 84.3 | 75.6 | 81.3 | 78.4 |
| SIGHAN15-REVISED | DCN | 76.4 | 70.5 | 68.3 | 69.4 | 75.2 | 68.6 | 66.5 | 67.5 |
| SIGHAN15-REVISED | LEAD | 76.1 | 70.1 | 67.0 | 68.5 | 75.5 | 68.9 | 65.9 | 67.3 |
| SIGHAN15-REVISED | REALISE | 77.0 | 71.0 | 68.8 | 69.8 | 76.4 | 69.8 | 67.6 | 68.7 |
| SIGHAN15-REVISED | Prompt-GEN-1 | 51.6 | 37.5 | 48.1 | 42.1 | 48.2 | 32.8 | 42.0 | 36.9 |
| SIGHAN15-REVISED | Prompt-GEN-2 | 71.5 | 61.5 | 74.1 | 67.2 | 70.0 | 59.5 | 71.7 | 65.0 |
| SIGHAN15-REVISED | Prompt-SEL | 77.7 | 75.3 | 70.1 | 72.6 | 76.9 | 73.7 | 68.6 | 71.1 |
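A hedged sketch of how sentence-level correction metrics of the kind reported in Tab. 3 and Tab. 4 (accuracy, precision, recall, F1) are commonly computed for CSC. The paper's exact evaluation script may differ; the conventions below (a sentence is a positive example iff its source contains an error, and a prediction counts as a true positive iff the system changed the sentence and matches the gold exactly) are assumptions following common practice.

```python
def correction_metrics(sources, predictions, golds):
    """Sentence-level correction accuracy, precision, recall, and F1."""
    tp = fp = fn = correct = 0
    for src, pred, gold in zip(sources, predictions, golds):
        if pred == gold:
            correct += 1            # accuracy counts exact matches
        if pred != src:             # the system made a correction
            if pred == gold:
                tp += 1             # correct correction
            else:
                fp += 1             # wrong or spurious correction
        elif src != gold:
            fn += 1                 # error left uncorrected
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return correct / len(sources), p, r, f1

# Toy example drawn from the sentences in Tab. 8.
srcs  = ["受到你的邮件", "今天天气很好", "我一遍高兴"]
preds = ["收到你的邮件", "今天天气很好", "我一遍高兴"]
golds = ["收到你的邮件", "今天天气很好", "我一边高兴"]
acc, p, r, f1 = correction_metrics(srcs, preds, golds)
print(f"{acc:.3f} {p} {r} {f1:.3f}")  # 0.667 1.0 0.5 0.667
```

Detection-level metrics are computed analogously, but comparing predicted error positions rather than full corrected sentences.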
Tab. 4 Zero-shot and few-shot spelling error correction results on SIGHAN15-REVISED dataset (ZSP = zero-shot prompting, FSP = few-shot prompting; Det = error detection, Cor = error correction; values in %)

| Method | Det-Acc | Det-P | Det-R | Det-F1 | Cor-Acc | Cor-P | Cor-R | Cor-F1 |
|---|---|---|---|---|---|---|---|---|
| Ref. [ ] | — | — | — | — | — | 62.5 | 53.4 | 57.6 |
| Qwen1.5-14B (ZSP) | — | — | — | — | — | — | — | 28.1 |
| Qwen1.5-14B (FSP) | — | — | — | — | — | — | — | 31.6 |
| Prompt-GEN-2 (ZSP) | 41.4 | 28.5 | 40.2 | 33.3 | 41.0 | 28.0 | 39.5 | 32.8 |
| Prompt-GEN-2 (FSP) | 47.6 | 37.3 | 54.4 | 44.2 | 47.6 | 37.3 | 54.4 | 44.2 |
| Prompt-SEL (ZSP) | 68.7 | 69.1 | 55.3 | 61.5 | 68.0 | 67.7 | 54.2 | 60.2 |
| Prompt-SEL (FSP) | 78.5 | 73.6 | 71.8 | 72.7 | 77.8 | 72.3 | 70.6 | 71.4 |
Tab. 5 Character-level statistics of LLM correction results

| Statistic | Prompt-GEN-1 | Prompt-GEN-2 |
|---|---|---|
| Sentences with mismatched length | 88 | 73 |
| Characters modified | 801 | 887 |
| Characters modified incorrectly | 368 | 300 |
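The character-level statistics in Tab. 5 can be reproduced with a simple positionwise comparison. This is a hedged sketch: for equal-length pairs it counts positions the model changed and, among those, changes that do not match the gold; generative LLMs may also emit outputs of a different length, which are tallied separately (the "mismatched length" row). The exact counting rules of the paper are not stated, so treat these conventions as assumptions.

```python
def char_level_stats(sources, predictions, golds):
    """Count length-mismatched outputs, modified characters, and
    incorrectly modified characters across a corpus."""
    length_mismatch = modified = wrongly_modified = 0
    for src, pred, gold in zip(sources, predictions, golds):
        if len(pred) != len(src):
            length_mismatch += 1
            continue  # positionwise comparison is undefined here
        for s, p, g in zip(src, pred, gold):
            if p != s:                 # the model changed this character
                modified += 1
                if p != g:             # ... and the change was wrong
                    wrongly_modified += 1
    return length_mismatch, modified, wrongly_modified

stats = char_level_stats(
    ["受到你的邮件", "我很高心"],
    ["收到你的邮件", "我很开心了"],   # second output changed length
    ["收到你的邮件", "我很开心"],
)
print(stats)  # (1, 1, 0)
```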
Tab. 6 Influence of LoRA rank r on model performance on SIGHAN15 dataset

| Rank r | Accuracy/% | Precision/% | Recall/% | F1/% |
|---|---|---|---|---|
| 8 | 76.2 | 63.1 | 60.1 | 61.6 |
| 16 | 76.5 | 63.5 | 60.2 | 61.8 |
| 32 | 76.6 | 63.5 | 60.3 | 61.8 |
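Tab. 6 shows that raising the LoRA rank beyond 8 yields only marginal gains. The back-of-the-envelope sketch below illustrates why small ranks are attractive: instead of updating a full d×d weight matrix, LoRA trains two low-rank factors B (d×r) and A (r×d), so trainable parameters grow linearly in r. The hidden size d = 4096 is an assumption for illustration (it is not stated in the table), and the arithmetic covers a single adapted d×d matrix.

```python
def lora_trainable_params(d: int, r: int) -> int:
    """Parameters of one LoRA-adapted d×d matrix: B (d×r) plus A (r×d)."""
    return 2 * d * r

d = 4096
full = d * d  # parameters updated by full fine-tuning of the same matrix
for r in (8, 16, 32):
    n = lora_trainable_params(d, r)
    print(f"r={r:2d}: {n:>7d} params ({n / full:.2%} of full fine-tuning)")
```

Even at r = 32 the adapter trains under 2% of the matrix's parameters, which is consistent with the small rank values explored in the table.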
Tab. 7 Case correctly corrected by the LLM-based multi-input correction method

| Model | Candidate correction |
|---|---|
| Sentence to be corrected | 要求师公单位对项目进行垫资。 |
| DCN | 要求市公单位对项目进行垫资。 |
| LEAD | 要求施工单位对项目进行垫资。 |
| REALISE | 要求示工单位对项目进行垫资。 |
| Prompt-SEL | 要求施工单位对项目进行垫资。 |

The misspelled 师公 should be 施工 ("construction"); only LEAD produces the correct sentence, and Prompt-SEL selects it from the candidate set.
Tab. 8 Case the LLM-based multi-input correction method fails to correct

| Model | Candidate correction |
|---|---|
| Sentence to be corrected | 受到你的邮件,我一遍高兴一边难过。 |
| DCN | 收到你的邮件,我一遍高兴一边难过。 |
| LEAD | 受到你的邮件,我一边高兴一边难过。 |
| REALISE | 受到你的邮件,我一遍高兴一边难过。 |
| Prompt-SEL | 无拼写正确结果 (no correctly spelled result) |

The input contains two errors (受到→收到 and 一遍→一边); each small model fixes at most one of them, so no candidate is fully correct and Prompt-SEL reports that no candidate is error-free.
References

[1] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional Transformers for language understanding [C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg: ACL, 2019: 4171-4186.
[2] CLARK K, LUONG M T, LE Q V, et al. ELECTRA: pre-training text encoders as discriminators rather than generators [EB/OL]. [2024-04-13].
[3] ZHANG S, HUANG H, LIU J, et al. Spelling error correction with Soft-Masked BERT [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 882-890.
[4] ZHU C, YING Z, ZHANG B, et al. MDCSpell: a multi-task detector-corrector framework for Chinese spelling correction [C]// Findings of the Association for Computational Linguistics: ACL 2022. Stroudsburg: ACL, 2022: 1244-1253.
[5] CHENG X, XU W, CHEN K, et al. SpellGCN: incorporating phonological and visual similarities into language models for Chinese spelling check [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 871-881.
[6] WANG B, CHE W, WU D, et al. Dynamic connected networks for Chinese spelling check [C]// Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Stroudsburg: ACL, 2021: 2437-2446.
[7] XU H D, LI Z, ZHOU Q, et al. Read, listen, and see: leveraging multimodal information helps Chinese spell checking [C]// Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Stroudsburg: ACL, 2021: 716-728.
[8] GANAIE M A, HU M, MALIK A K, et al. Ensemble deep learning: a review [J]. Engineering Applications of Artificial Intelligence, 2022, 115: No.105151.
[9] KILICOGLU H, FISZMAN M, ROBERTS K, et al. An ensemble method for spelling correction in consumer health questions [C]// Proceedings of the 2015 AMIA Annual Symposium. Bethesda, MD: AMIA, 2015: 727-736.
[10] TANG C, WU X, WU Y. Are pre-trained language models useful for model ensemble in Chinese grammatical error correction? [C]// Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Stroudsburg: ACL, 2023: 893-901.
[11] OUYANG L, WU J, JIANG X, et al. Training language models to follow instructions with human feedback [C]// Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2022: 27730-27744.
[12] FAN Y, JIANG F, LI P, et al. GrammarGPT: exploring open-source LLMs for native Chinese grammatical error correction with supervised fine-tuning [C]// Proceedings of the 2023 National CCF Conference on Natural Language Processing and Chinese Computing, LNCS 14304. Cham: Springer, 2023: 69-80.
[13] BOROS E, EHRMANN M, ROMANELLO M, et al. Post-correction of historical text transcripts with large language models: an exploratory study [C]// Proceedings of the 8th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature. Stroudsburg: ACL, 2024: 133-159.
[14] DONG M, CHEN Y, ZHANG M, et al. Rich semantic knowledge enhanced large language models for few-shot Chinese spell checking [C]// Findings of the Association for Computational Linguistics: ACL 2024. Stroudsburg: ACL, 2024: 7372-7383.
[15] ZHOU H, LI Z, ZHANG B, et al. A simple yet effective training-free prompt-free approach to Chinese spelling correction based on large language models [C]// Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2024: 17446-17467.
[16] QIAO S, OU Y, ZHANG N, et al. Reasoning with language model prompting: a survey [C]// Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2023: 5368-5393.
[17] HU E J, SHEN Y, WALLIS P, et al. LoRA: low-rank adaptation of large language models [EB/OL]. [2024-02-14].
[18] LIU C L, LAI M H, CHUANG Y H, et al. Visually and phonologically similar characters in incorrect simplified Chinese words [C]// Proceedings of the 23rd International Conference on Computational Linguistics: Posters Volume. [S.l.]: Coling 2010 Organizing Committee, 2010: 739-747.
[19] LIU S, YANG T, YUE T, et al. PLOME: pre-training with misspelled knowledge for Chinese spelling correction [C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg: ACL, 2021: 2991-3000.
[20] ZHANG R, PANG C, ZHANG C, et al. Correcting Chinese spelling errors with phonetic pre-training [C]// Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Stroudsburg: ACL, 2021: 2250-2261.
[21] MENG Y, WU W, WANG F, et al. Glyce: glyph-vectors for Chinese character representations [C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2019: 2746-2757.
[22] LI Y, MA S, ZHOU Q, et al. Learning from the dictionary: heterogeneous knowledge guided fine-tuning for Chinese spell checking [C]// Findings of the Association for Computational Linguistics: EMNLP 2022. Stroudsburg: ACL, 2022: 238-249.
[23] WU Y Y, HUANG R Z, BAI R N, et al. Multi-input fusion spelling error correction model based on contrast optimization [J]. Pattern Recognition and Artificial Intelligence, 2024, 37(1): 85-94. (in Chinese)
[24] WANG Y, WANG B, LIU Y, et al. LM-Combiner: a contextual rewriting model for Chinese grammatical error correction [C]// Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation. [S.l.]: ELRA and ICCL, 2024: 10675-10685.
[25] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010.
[26] WU S H, LIU C L, LEE L H. Chinese spelling check evaluation at SIGHAN Bake-off 2013 [C]// Proceedings of the 7th SIGHAN Workshop on Chinese Language Processing. [S.l.]: Asian Federation of Natural Language Processing, 2013: 35-42.
[27] YU L C, LEE L H, TSENG Y H, et al. Overview of SIGHAN 2014 Bake-off for Chinese spelling check [C]// Proceedings of the 3rd CIPS-SIGHAN Joint Conference on Chinese Language Processing. Stroudsburg: ACL, 2014: 126-132.
[28] TSENG Y H, LEE L H, CHANG L P, et al. Introduction to SIGHAN 2015 Bake-off for Chinese spelling check [C]// Proceedings of the 8th SIGHAN Workshop on Chinese Language Processing. Stroudsburg: ACL, 2015: 32-37.
[29] YANG L, LIU X, LIAO T, et al. Is Chinese Spelling Check ready? Understanding the correction behavior in real-world scenarios [J]. AI Open, 2023, 4: 183-192.
[30] WANG D, SONG Y, LI J, et al. A hybrid approach to automatic corpus generation for Chinese spelling check [C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2018: 2517-2527.