Journal of Computer Applications

Chinese Spelling Correction Algorithm Based on Multi-Modal Information Fusion

ZHANG Qing (1,2), YANG Fan (1,2)*, FANG Yuhan (1,2)

  1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, Sichuan 610213, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China

  • Received: 2024-05-17 Revised: 2024-12-19 Accepted: 2024-12-26 Online: 2025-01-03 Published: 2025-01-03
  • Contact: YANG Fan


Abstract: Chinese spelling correction aims to detect and correct character-level and word-level errors in user-entered Chinese text. These errors usually arise from confusing characters that are similar in meaning, pronunciation, or glyph shape. Existing models, however, often neglect local information, fail to fully capture the phonetic and glyph similarities between Chinese characters, and cannot effectively integrate these similarities with semantic information. To address these issues, we propose PWSpell, a Chinese spelling correction algorithm based on multimodal information fusion. The algorithm uses a convolutional attention mechanism to focus on local semantic information, uses pinyin encoding to capture phonetic similarity between characters, and, for the first time, introduces Wubi encoding into Chinese spelling correction to capture glyph similarity between characters. It then selectively fuses these two kinds of similarity information with the semantic information produced by BERT. Experimental results on the SIGHAN15 test set show that the proposed model improves detection accuracy, precision, and F1-score, as well as correction precision and F1-score, with correction precision increasing by 1%. Ablation studies further confirm that each module of the model contributes to its performance.

Key words: Chinese natural language processing, Chinese spelling correction, BERT, multimodal information fusion, local information
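
The intuition behind using pinyin and Wubi codes, as described in the abstract, can be illustrated with a small sketch. It is not PWSpell's method: the paper uses learned encoders, whereas this example only compares raw code strings with Python's standard `difflib`. The table `TOY_CODES` and the helpers `code_similarity` and `compare` are hypothetical names, and the toneless pinyin and Wubi 86 codes are hand-entered for illustration and should be checked against a real dictionary before reuse.

```python
from difflib import SequenceMatcher

# Hand-entered toy table (an assumption, for illustration only):
# toneless pinyin and Wubi 86 key codes for three characters.
TOY_CODES = {
    "清": {"pinyin": "qing", "wubi": "igeg"},
    "情": {"pinyin": "qing", "wubi": "ngeg"},
    "河": {"pinyin": "he", "wubi": "iskg"},
}


def code_similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two code strings."""
    return SequenceMatcher(None, a, b).ratio()


def compare(char_a: str, char_b: str) -> dict:
    """Compare two characters by their pinyin codes and by their Wubi codes."""
    return {
        kind: code_similarity(TOY_CODES[char_a][kind], TOY_CODES[char_b][kind])
        for kind in ("pinyin", "wubi")
    }


if __name__ == "__main__":
    for pair in (("清", "情"), ("清", "河")):
        print(pair, compare(*pair))
```

In this toy table, 清 and 情 share a pinyin spelling and differ only in the first Wubi key, so both signals flag them as easily confused, while 清 and 河 share only the water radical and score lower on both. A model such as the one described would learn from these codes rather than apply a fixed string metric.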

