Chinese spelling correction algorithm based on multi-modal information fusion

Journal of Computer Applications, 2025, Vol. 45, Issue (5): 1528-1534. DOI: 10.11772/j.issn.1001-9081.2024050628

• Artificial Intelligence •

Qing ZHANG¹,², Fan YANG¹,², Yuhan FANG¹,²

1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu Sichuan 610213, China
2. University of Chinese Academy of Sciences, Beijing 100049, China

Received: 2024-05-17 Revised: 2024-12-10 Accepted: 2024-12-26 Online: 2025-01-03 Published: 2025-05-10
Contact: Fan YANG
About author: ZHANG Qing, born in 2000 in Yangquan, Shanxi, M. S. candidate. His research interests include natural language processing, big data analytics, and machine learning.
YANG Fan, born in 1978 in Danyang, Jiangsu, Ph. D., senior engineer. His research interests include big data, artificial intelligence, and industrial software.
FANG Yuhan, born in 2000 in Ziyang, Sichuan, M. S. candidate. His research interests include deep learning and natural language processing.
Supported by: Science and Technology Program of Sichuan Province (24QYCX0229); Key Research and Development Support Program of Chengdu (2023-YF11-00092-HZ)

Abstract:

The goal of Chinese Spelling Correction (CSC) is to detect and correct character-level or word-level errors in user-input Chinese text. Such errors usually arise when a character is replaced by another that is semantically, phonetically, or visually similar. However, existing models often neglect local information, fail to fully capture the phonetic and glyph similarities between Chinese characters, and do not integrate these similarities effectively with semantic information. To address these issues, a CSC algorithm based on multimodal information fusion, named PWSpell, is proposed. The algorithm uses a convolutional attention mechanism to focus on local semantic information, uses Pinyin encoding to capture phonetic similarity between characters, and, for the first time, introduces Wubi encoding into the CSC domain to capture glyph similarity between characters. These two kinds of similarity information are then selectively fused with the semantic information produced by BERT (Bidirectional Encoder Representations from Transformers). Experimental results show that, on the SIGHAN 2015 test set, PWSpell improves detection-level accuracy, precision, and F1 score as well as correction-level precision and F1 score, with correction-level precision increasing by at least 1 percentage point. Ablation results also confirm that each module in the algorithm contributes to model performance.

Key words: Chinese natural language processing, Chinese Spelling Correction (CSC), BERT (Bidirectional Encoder Representations from Transformers), multimodal information fusion, local information
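
The abstract states that phonetic and glyph similarity information is selectively fused with BERT's semantic information, but this page does not give the fusion formula. The following is only a minimal sketch of one generic way to do selective fusion, a scalar gate per auxiliary modality. The function `gated_fusion`, the weight vectors, and the toy feature values are all hypothetical and are not taken from PWSpell.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(semantic, phonetic, glyph, w_phonetic, w_glyph):
    """Fuse per-character feature vectors of equal length d.

    w_phonetic and w_glyph are hypothetical gate weights of length 2 * d.
    Each gate is a scalar in (0, 1) that decides how much of the phonetic
    or glyph vector is added to the semantic vector.
    """
    # Each gate looks at the semantic vector concatenated with one modality.
    gate_p = sigmoid(sum(w * x for w, x in zip(w_phonetic, semantic + phonetic)))
    gate_g = sigmoid(sum(w * x for w, x in zip(w_glyph, semantic + glyph)))
    fused = [s + gate_p * p + gate_g * g
             for s, p, g in zip(semantic, phonetic, glyph)]
    return fused, gate_p, gate_g


semantic = [1.0, 0.0]  # stand-in for a BERT hidden state (assumption)
phonetic = [0.2, 0.4]  # stand-in for a Pinyin-derived vector (assumption)
glyph = [0.6, 0.0]     # stand-in for a Wubi-derived vector (assumption)

# Zero weights leave both gates at 0.5, i.e. an undecided gate.
fused, gate_p, gate_g = gated_fusion(semantic, phonetic, glyph, [0.0] * 4, [0.0] * 4)
print(gate_p, gate_g, fused)
```

In a trained model the gate weights would be learned, so a gate near 0 would suppress a modality that is unhelpful for a given character and a gate near 1 would let it through.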

CLC number: TP391.1

Cite this article: Qing ZHANG, Fan YANG, Yuhan FANG. Chinese spelling correction algorithm based on multi-modal information fusion[J]. Journal of Computer Applications, 2025, 45(5): 1528-1534.

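Erroneous sentence	Correct sentence	Change	Error type
今天完的开心吗？	今天玩的开心吗？	(完 → 玩)	Similar pronunciation
我门来了	我们来了	(门 → 们)	Similar glyph
这里的湖水真青澈	这里的湖水真清澈	(青 → 清)	Similar pronunciation + similar glyph
他由于顶不住压迫而丧失了原则	他由于顶不住压力而丧失了原则	(迫 → 力)	Not a spelling error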
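
The sentence pairs in the table above differ by single-character substitutions, so the change in each row can be recovered by aligning the two sentences position by position. The sketch below assumes that the erroneous and correct sentences have equal length, as they do in every row of the table. The helper name `substitutions` is hypothetical.

```python
def substitutions(wrong: str, right: str):
    """Return (position, wrong_char, right_char) for each differing character."""
    if len(wrong) != len(right):
        raise ValueError("this sketch only handles equal-length sentence pairs")
    return [(i, w, r) for i, (w, r) in enumerate(zip(wrong, right)) if w != r]


pairs = [
    ("今天完的开心吗？", "今天玩的开心吗？"),
    ("我门来了", "我们来了"),
    ("这里的湖水真青澈", "这里的湖水真清澈"),
    ("他由于顶不住压迫而丧失了原则", "他由于顶不住压力而丧失了原则"),
]

for wrong, right in pairs:
    print(substitutions(wrong, right))
```

Position-wise alignment finds the changed character but says nothing about the error type. Deciding whether a pair such as 完/玩 is phonetic or visual needs extra information such as Pinyin or Wubi codes.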

Example 1 (similar glyphs)		Example 2 (identical pronunciation)
Character	Wubi code	Character	Wubi code
请	ygeg	化	wxn-
清	igeg	画	glbj
情	ngeg	话	ytdg
晴	jgeg	桦	swxf
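
The Wubi codes in the table above can be compared key by key to see why they carry glyph information rather than pronunciation. The sketch below uses the fraction of positions with matching keys as a simple illustrative measure. This overlap score and the helper `wubi_overlap` are assumptions made for illustration, not the encoding or similarity function used by PWSpell.

```python
from itertools import combinations

# Wubi codes copied from the table above.
WUBI = {
    "请": "ygeg", "清": "igeg", "情": "ngeg", "晴": "jgeg",
    "化": "wxn-", "画": "glbj", "话": "ytdg", "桦": "swxf",
}


def wubi_overlap(a: str, b: str) -> float:
    """Fraction of code positions at which two characters share a key."""
    code_a, code_b = WUBI[a], WUBI[b]
    same = sum(x == y for x, y in zip(code_a, code_b))
    return same / max(len(code_a), len(code_b))


glyph_group = "请清情晴"      # Example 1: similar glyphs
homophone_group = "化画话桦"  # Example 2: identical pronunciation

for group in (glyph_group, homophone_group):
    for a, b in combinations(group, 2):
        print(a, b, wubi_overlap(a, b))
```

Every pair in the first group shares the three keys for the common component 青 and scores 0.75, while every pair in the second group scores 0.0 despite being pronounced the same. A position-wise comparison misses shifted matches, such as the `wx` keys that 化 and 桦 share at different positions, so an edit-distance measure could be substituted.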

Split	Dataset	Sentences	Avg. sentence length	Errors
Training set	Wang271K	271 329	42.6	381 962
	SIGHAN 2013	700	41.8	343
	SIGHAN 2014	3 437	49.5	5 134
	SIGHAN 2015	2 339	31.3	3 038
	Total	277 805	42.6	390 477
Test set	SIGHAN 2015	1 100	30.6	703
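
The totals row of the training set can be reproduced from the four component corpora. The short check below assumes that the overall average sentence length is the sentence-weighted mean of the per-corpus averages. The page does not state how that figure was computed.

```python
# (sentences, average sentence length, errors) copied from the table above.
train = {
    "Wang271K": (271329, 42.6, 381962),
    "SIGHAN 2013": (700, 41.8, 343),
    "SIGHAN 2014": (3437, 49.5, 5134),
    "SIGHAN 2015": (2339, 31.3, 3038),
}

total_sentences = sum(n for n, _, _ in train.values())
total_errors = sum(e for _, _, e in train.values())
# Weight each corpus average by its number of sentences.
avg_length = sum(n * length for n, length, _ in train.values()) / total_sentences

print(total_sentences, total_errors, round(avg_length, 1))
```

Wang271K supplies almost all of the training sentences, so the overall average length is nearly identical to that corpus's own average.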

	Detection level				Correction level
Algorithm	Acc	Pre	Rec	F1	Acc	Pre	Rec	F1
FASPell	74.2	67.6	60.6	63.5	73.7	66.6	59.1	62.6
SpellGCN	—	74.8	80.7	77.7	—	72.1	77.7	75.9
RoBERTa-DCN	—	76.6	79.8	78.2	—	74.2	77.3	75.7
GAD	—	75.6	80.4	77.9	—	73.2	77.8	75.4
PLOME	—	77.4	81.5	79.4	—	75.3	79.3	77.2
Realize	84.7	77.3	81.3	79.3	84.0	75.9	79.9	77.8
ECSpell	83.4	76.4	79.9	78.1	82.4	74.4	77.9	76.1
MCSpell	81.2	75.1	82.4	79.2	87.9	74.6	72.4	76.3
MASC-MacBERT	—	76.6	81.8	78.8	—	73.5	77.8	75.6
M-A-CSC	84.4	77.0	80.3	78.6	83.0	75.9	79.9	76.9
PWSpell	85.2	78.1	81.1	79.6	84.5	76.9	79.9	78.3
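
The F1 column in the table above is conventionally the harmonic mean of precision (Pre) and recall (Rec). The sketch below recomputes it for the two PWSpell entries. The function name `f1_score` is illustrative, and the inputs are the rounded percentages from the table, so the last digit may differ from the value computed on unrounded counts.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


detection_f1 = f1_score(78.1, 81.1)   # PWSpell, detection level
correction_f1 = f1_score(76.9, 79.9)  # PWSpell, correction level
print(round(detection_f1, 1), round(correction_f1, 1))
```

The detection-level value reproduces the reported 79.6. The correction-level value comes out at 78.4 against the reported 78.3, a gap that is consistent with the inputs having been rounded to one decimal place.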

Level	Algorithm	Acc	Pre	Rec	F1
Detection level	PWSpell	85.2	78.1	81.1	79.6
	PWSpell-a	81.7	71.6	79.1	75.2
	PWSpell-b	81.4	71.4	78.6	74.8
	PWSpell-c	83.3	76.1	80.3	78.0
	PWSpell-d	83.5	73.7	78.9	76.3
	PWSpell-e	84.0	76.6	80.6	78.6
Correction level	PWSpell	84.5	76.9	79.9	78.3
	PWSpell-a	81.0	70.2	77.6	73.7
	PWSpell-b	80.5	69.7	76.7	73.1
	PWSpell-c	82.3	74.2	78.0	76.0
	PWSpell-d	82.2	71.3	76.3	73.8
	PWSpell-e	83.2	75.0	78.9	76.9

Algorithm	CPU time/ms	GPU time/ms
PWSpell	106.5	1.4
Realize	112.1	2.0
