Chinese spelling correction algorithm based on multi-modal information fusion

Journal of Computer Applications, 2025, Vol. 45, Issue (5): 1528-1534. DOI: 10.11772/j.issn.1001-9081.2024050628

• Artificial intelligence •

Qing ZHANG¹,², Fan YANG¹,², Yuhan FANG¹,²

1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, Sichuan 610213, China
2. University of Chinese Academy of Sciences, Beijing 100049, China

Received: 2024-05-17; Revised: 2024-12-10; Accepted: 2024-12-26; Online: 2025-01-03; Published: 2025-05-10
Contact: Fan YANG
About the authors:
ZHANG Qing, born in 2000 in Yangquan, Shanxi, M.S. candidate. His research interests include natural language processing, big data analytics, and machine learning.
YANG Fan, born in 1978 in Danyang, Jiangsu, Ph.D., senior engineer. His research interests include big data, artificial intelligence, and industrial software.
FANG Yuhan, born in 2000 in Ziyang, Sichuan, M.S. candidate. His research interests include deep learning and natural language processing.
Supported by:
Science and Technology Program of Sichuan Province (24QYCX0229); Key Research and Development Support Program of Chengdu (2023-YF11-00092-HZ)

Abstract

Chinese Spelling Correction (CSC) aims to detect and correct character-level or word-level errors in user-input Chinese text; such errors commonly arise from semantic, phonetic, or glyphic similarity among Chinese characters. However, existing models often neglect local information, fail to fully capture the phonetic and glyphic similarities among characters, and do not integrate these similarities effectively with semantic information. To address these issues, a CSC algorithm based on multimodal information fusion, named PWSpell, is proposed. The algorithm uses a convolutional attention mechanism to focus on local semantic information, employs Pinyin encoding to capture phonetic similarities among characters, and, for the first time, introduces Wubi encoding into the CSC domain to capture glyphic similarities. It then selectively fuses these two types of similarity information with the semantic information produced by BERT (Bidirectional Encoder Representations from Transformers). Experimental results on the SIGHAN 2015 test set show that, relative to the compared algorithms, PWSpell improves detection-level accuracy, precision, and F1-score as well as correction-level precision and F1-score, with correction-level precision increasing by at least one percentage point. Ablation results further confirm that each module in PWSpell contributes to its performance.

Key words: Chinese natural language processing, Chinese Spelling Correction (CSC), BERT (Bidirectional Encoder Representations from Transformers), multimodal information fusion, local information
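
The selective fusion step mentioned in the abstract can be pictured with a small gating sketch. The code below is an illustrative assumption and not PWSpell's published architecture: the names `sigmoid`, `dot`, and `gated_fusion`, the scalar dot-product gates, and the toy vectors are all hypothetical, and a trained model would normally compute its gates with learned weight matrices.

```python
import math
from typing import List

Vector = List[float]


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def gated_fusion(semantic: Vector, phonetic: Vector, glyph: Vector) -> Vector:
    """Hypothetical selective fusion (not the paper's exact formulation).

    Each auxiliary vector is scaled by a gate in (0, 1) derived from its
    agreement with the semantic vector, then added to the semantic vector.
    """
    gate_p = sigmoid(dot(semantic, phonetic))  # how much phonetic info to admit
    gate_g = sigmoid(dot(semantic, glyph))     # how much glyph info to admit
    return [s + gate_p * p + gate_g * g
            for s, p, g in zip(semantic, phonetic, glyph)]


# Toy three-dimensional vectors, chosen only for illustration
fused = gated_fusion([0.2, -0.1, 0.4], [0.5, 0.5, 0.0], [0.0, 0.3, 0.3])
print(fused)
```

The point of the gate is that phonetic or glyph evidence is admitted in proportion to a computed weight rather than simply concatenated, which is one common way to read "selectively integrated".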

CLC Number: TP391.1

Qing ZHANG, Fan YANG, Yuhan FANG. Chinese spelling correction algorithm based on multi-modal information fusion[J]. Journal of Computer Applications, 2025, 45(5): 1528-1534.

Figures/Tables 10

References 19

1	MARTINS B, SILVA M J. Spelling correction for search engine queries [C]// Proceedings of the 2004 International Conference on Natural Language Processing (in Spain), LNCS 3230. Berlin: Springer, 2004: 372-383.
2	AFLI H, QIU Z, WAY A, et al. Using SMT for OCR error correction of historical texts [C]// Proceedings of the 10th International Conference on Language Resources and Evaluation. Paris: ELRA, 2016: 962-966.
3	HINTON G, DENG L, YU D, et al. Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups [J]. IEEE Signal Processing Magazine, 2012, 29(6): 82-97.
4	LIU C L, LAI M H, CHUANG Y H, et al. Visually and phonologically similar characters in incorrect simplified Chinese words [C]// Proceedings of the 23rd International Conference on Computational Linguistics: Posters. [S.l.]: Coling 2010 Organizing Committee, 2010: 739-747.
5	HONG Y, YU X, HE N, et al. FASPell: a fast, adaptable, simple, powerful Chinese spell checker based on DAE-decoder paradigm [C]// Proceedings of the 5th Workshop on Noisy User-generated Text. Stroudsburg: ACL, 2019: 160-169.
6	CHENG X, XU W, CHEN K, et al. SpellGCN: incorporating phonological and visual similarities into language models for Chinese spelling check [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 871-881.
7	WANG B, CHE W, WU D, et al. Dynamic connected networks for Chinese spelling check [C]// Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Stroudsburg: ACL, 2021: 2437-2446.
8	GUO Z, NI Y, WANG K, et al. Global attention decoder for Chinese spelling error correction [C]// Proceedings of the 2021 Findings of the Association for Computational Linguistics. Stroudsburg: ACL, 2021: 1419-1428.
9	LIU S, YANG T, YUE T, et al. PLOME: pre-training with misspelled knowledge for Chinese spelling correction [C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg: ACL, 2021: 2991-3000.
10	XU H D, LI Z, ZHOU Q, et al. Read, listen, and see: leveraging multimodal information helps Chinese spell checking [C]// Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Stroudsburg: ACL, 2021: 716-728.
11	LV Q, CAO Z, GENG L, et al. General and domain-adaptive Chinese spelling check with error-consistent pretraining [J]. ACM Transactions on Asian and Low-Resource Language Information Processing, 2023, 22(5): No.124.
12	DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding [C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg: ACL, 2019: 4171-4186.
13	TSENG Y H, LEE L H, CHANG L P, et al. Introduction to SIGHAN 2015 bake-off for Chinese spelling check [C]// Proceedings of the 8th SIGHAN Workshop on Chinese Language Processing. Stroudsburg: ACL, 2015: 32-37.
14	JI T, YAN H, QIU X. SpellBERT: a lightweight pretrained model for Chinese spelling check [C]// Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2021: 3544-3551.
15	WANG D, SONG Y, LI J, et al. A hybrid approach to automatic corpus generation for Chinese spelling check [C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2018: 2517-2527.
16	CUI Y, CHE W, LIU T, et al. Pre-training with whole word masking for Chinese BERT [J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, 29: 3504-3514.
17	PAN G M. Research on multi-modal Chinese spelling correction [D]. Beijing: North China University of Technology, 2024: 24-28. (in Chinese)
18	SU J D, YU S S, HONG X B. A self-supervised pre-training method for Chinese spelling correction [J]. Journal of South China University of Technology (Natural Science Edition), 2023, 51(9): 90-98. (in Chinese)
19	WANG Y, WANG Y, LIU Y. Chinese spelling correction method based on multi-feature fusion and attention mechanism [C]// Proceedings of the 3rd International Conference on Computer, Artificial Intelligence and Control Engineering. New York: ACM, 2024: 481-487.

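Erroneous sentence	Correct sentence	Correction	Error type
今天完的开心吗？	今天玩的开心吗？ (Did you have fun today?)	完 → 玩	Phonetically similar
我门来了	我们来了 (We are here)	门 → 们	Visually similar
这里的湖水真青澈	这里的湖水真清澈 (The lake water here is really clear)	青 → 清	Phonetically and visually similar
他由于顶不住压迫而丧失了原则	他由于顶不住压力而丧失了原则 (He abandoned his principles because he could not withstand the pressure)	迫 → 力	Not a spelling error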
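
The corrections listed in the table above can be recovered mechanically by comparing each sentence pair character by character. The sketch below assumes substitution-only errors, so both sentences have the same length; the function name `find_substitutions` is hypothetical and is not taken from the paper.

```python
from typing import List, Tuple


def find_substitutions(wrong: str, correct: str) -> List[Tuple[int, str, str]]:
    """Return (index, wrong_char, correct_char) for every differing position.

    Assumes both sentences have the same length, which holds for the
    substitution-only errors shown in the table.
    """
    if len(wrong) != len(correct):
        raise ValueError("sentences must have equal length")
    return [(i, w, c) for i, (w, c) in enumerate(zip(wrong, correct)) if w != c]


# Sentence pairs copied from the first three rows of the table
pairs = [
    ("今天完的开心吗？", "今天玩的开心吗？"),
    ("我门来了", "我们来了"),
    ("这里的湖水真青澈", "这里的湖水真清澈"),
]
for wrong, correct in pairs:
    print(find_substitutions(wrong, correct))
```

A character diff alone cannot separate the fourth row (迫 → 力) from genuine spelling errors; that distinction depends on whether the two characters are phonetically or visually similar.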

Example 1 (similar glyphs)		Example 2 (identical pronunciation)
Character	Wubi code	Character	Wubi code
请	ygeg	化	wxn-
清	igeg	画	glbj
情	ngeg	话	ytdg
晴	jgeg	桦	swxf
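
The contrast between the two examples above can be made concrete with a rough position-wise comparison of Wubi codes. This is a minimal sketch, not the encoder used in PWSpell: the function `wubi_overlap` is hypothetical, the codes are copied from the table, and `-` is assumed to mark an unused code position.

```python
# Wubi codes copied from the table above
WUBI = {
    "请": "ygeg", "清": "igeg", "情": "ngeg", "晴": "jgeg",
    "化": "wxn-", "画": "glbj", "话": "ytdg", "桦": "swxf",
}


def wubi_overlap(a: str, b: str) -> float:
    """Fraction of code positions at which two characters share a key.

    The padding symbol "-" is assumed to mean "no key" and never counts
    as a match.
    """
    code_a, code_b = WUBI[a], WUBI[b]
    matches = sum(1 for x, y in zip(code_a, code_b) if x == y and x != "-")
    return matches / max(len(code_a), len(code_b))


print(wubi_overlap("请", "清"))  # visually similar pair from Example 1
print(wubi_overlap("化", "画"))  # homophone pair from Example 2
```

Under this crude measure the Example 1 characters agree on three of four positions because they share the component 青, while the homophones in Example 2 share none, which is why a glyph-oriented code complements Pinyin rather than duplicating it.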

Split	Dataset	Sentences	Avg. sentence length	Errors
Training set	Wang271K	271,329	42.6	381,962
	SIGHAN 2013	700	41.8	343
	SIGHAN 2014	3,437	49.5	5,134
	SIGHAN 2015	2,339	31.3	3,038
	Total	277,805	42.6	390,477
Test set	SIGHAN 2015	1,100	30.6	703
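
The totals row of the training set can be re-derived from the four component corpora. The snippet below is only an arithmetic check on the figures in the table above; the variable names are illustrative, and the average length is assumed to be a sentence-weighted mean.

```python
# (sentences, average sentence length, errors), copied from the table above
TRAIN = {
    "Wang271K": (271_329, 42.6, 381_962),
    "SIGHAN 2013": (700, 41.8, 343),
    "SIGHAN 2014": (3_437, 49.5, 5_134),
    "SIGHAN 2015": (2_339, 31.3, 3_038),
}

total_sentences = sum(n for n, _, _ in TRAIN.values())
total_errors = sum(e for _, _, e in TRAIN.values())
# Sentence-weighted mean of the per-corpus average lengths
mean_length = sum(n * avg for n, avg, _ in TRAIN.values()) / total_sentences
errors_per_sentence = total_errors / total_sentences

print(total_sentences, total_errors, round(mean_length, 1))
```

Because Wang271K supplies the vast majority of the sentences, the combined average length is dominated by that corpus.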

Algorithm	Detection level (%)				Correction level (%)
	Acc	Pre	Rec	F1	Acc	Pre	Rec	F1
FASPell	74.2	67.6	60.6	63.5	73.7	66.6	59.1	62.6
SpellGCN	-	74.8	80.7	77.7	-	72.1	77.7	75.9
RoBERTa-DCN	-	76.6	79.8	78.2	-	74.2	77.3	75.7
GAD	-	75.6	80.4	77.9	-	73.2	77.8	75.4
PLOME	-	77.4	81.5	79.4	-	75.3	79.3	77.2
Realize	84.7	77.3	81.3	79.3	84.0	75.9	79.9	77.8
ECSpell	83.4	76.4	79.9	78.1	82.4	74.4	77.9	76.1
MCSpell	81.2	75.1	82.4	79.2	87.9	74.6	72.4	76.3
MASC-MacBERT	-	76.6	81.8	78.8	-	73.5	77.8	75.6
M-A-CSC	84.4	77.0	80.3	78.6	83.0	75.9	79.9	76.9
PWSpell	85.2	78.1	81.1	79.6	84.5	76.9	79.9	78.3
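
Assuming the F1 column in the table above is the usual harmonic mean of precision and recall, it can be recomputed from the other two columns. The function name `f1_score` is illustrative. Recomputed values may differ from the published ones in the last digit, since the tabulated precision and recall are themselves rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Detection-level precision and recall of PWSpell, copied from the table
pwspell_detection_f1 = f1_score(78.1, 81.1)
print(round(pwspell_detection_f1, 1))
```

The harmonic mean penalizes imbalance between the two inputs, so a system cannot reach a high F1 by trading most of its recall for precision or the reverse.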

Level	Algorithm	Acc	Pre	Rec	F1
Detection level	PWSpell	85.2	78.1	81.1	79.6
	PWSpell-a	81.7	71.6	79.1	75.2
	PWSpell-b	81.4	71.4	78.6	74.8
	PWSpell-c	83.3	76.1	80.3	78.0
	PWSpell-d	83.5	73.7	78.9	76.3
	PWSpell-e	84.0	76.6	80.6	78.6
Correction level	PWSpell	84.5	76.9	79.9	78.3
	PWSpell-a	81.0	70.2	77.6	73.7
	PWSpell-b	80.5	69.7	76.7	73.1
	PWSpell-c	82.3	74.2	78.0	76.0
	PWSpell-d	82.2	71.3	76.3	73.8
	PWSpell-e	83.2	75.0	78.9	76.9
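
One way to read the ablation table above is to compute how far each variant falls below the full model. The sketch uses the correction-level F1 column only; the variable names are illustrative, and this page does not state which module each of the variants a to e removes.

```python
# Correction-level F1 values copied from the table above
FULL_F1 = 78.3
ABLATED_F1 = {
    "PWSpell-a": 73.7,
    "PWSpell-b": 73.1,
    "PWSpell-c": 76.0,
    "PWSpell-d": 73.8,
    "PWSpell-e": 76.9,
}

# Drop in F1 (percentage points) when each variant replaces the full model
drops = {name: round(FULL_F1 - f1, 1) for name, f1 in ABLATED_F1.items()}
largest = max(drops, key=drops.get)

for name, drop in drops.items():
    print(f"{name}: -{drop} points")
print("largest drop:", largest)
```

Every variant scores below the full model, which is the pattern the abstract summarizes as each module contributing to performance.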

Algorithm	CPU time/ms	GPU time/ms
PWSpell	106.5	1.4
Realize	112.1	2.0
