Chinese spelling correction algorithm based on multi-modal information fusion
Qing ZHANG, Fan YANG, Yuhan FANG
Journal of Computer Applications, 2025, 45(5): 1528-1534. DOI: 10.11772/j.issn.1001-9081.2024050628

Chinese Spelling Correction (CSC) aims to detect and correct character-level and word-level errors in user-input Chinese text, which commonly arise from semantic, phonetic, or glyphic similarities among Chinese characters. However, existing models often neglect local information, fail to fully capture the phonetic and glyphic similarities among Chinese characters, and do not effectively integrate these similarities with semantic information. To address these issues, a new CSC algorithm based on multimodal information fusion, named PWSpell, is proposed. The algorithm uses a convolutional attention mechanism to focus on local semantic information, employs Pinyin encoding to capture phonetic similarities among characters, and, for the first time, introduces Wubi encoding into the CSC domain to capture glyphic similarities. It then selectively integrates these two types of similarity information with the semantic information produced by BERT (Bidirectional Encoder Representations from Transformers). Experimental results on the SIGHAN 2015 test set show that PWSpell improves error detection accuracy, precision, and F1-score, as well as correction precision and F1-score, with correction precision rising by at least one percentage point. Ablation results further confirm that the design of each module in PWSpell contributes to its performance.
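
The selective integration step described in the abstract can be illustrated with a small sketch. The abstract does not specify how the fusion is computed, so the scalar sigmoid gates below are an assumption chosen for illustration, not PWSpell's actual formulation. The names gated_fusion, w_pinyin, and w_wubi are hypothetical, and the short vectors stand in for per-character embeddings that a real system would obtain from BERT, a Pinyin encoder, and a Wubi encoder.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    """Dot product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def gated_fusion(semantic, pinyin, wubi, w_pinyin, w_wubi):
    """Fuse one character's semantic, phonetic, and glyphic vectors.

    ASSUMPTION: each modality gets a scalar gate computed from the
    concatenation of the semantic vector and that modality's vector.
    This is one common way to fuse "selectively"; the paper's own
    mechanism may differ.
    """
    # Gate values near 1 let the modality through; values near 0 suppress it.
    g_pinyin = sigmoid(dot(w_pinyin, semantic + pinyin))
    g_wubi = sigmoid(dot(w_wubi, semantic + wubi))
    fused = [
        s + g_pinyin * p + g_wubi * w
        for s, p, w in zip(semantic, pinyin, wubi)
    ]
    return fused, g_pinyin, g_wubi


# Toy 2-dimensional vectors (hypothetical values, for illustration only).
semantic = [1.0, 0.0]
pinyin = [0.2, 0.4]
wubi = [0.0, 0.6]

# With all-zero gate weights, both gates equal sigmoid(0) = 0.5.
zero_weights = [0.0, 0.0, 0.0, 0.0]
fused, g_pinyin, g_wubi = gated_fusion(
    semantic, pinyin, wubi, zero_weights, zero_weights
)
print(fused, g_pinyin, g_wubi)
```

In a trained model the gate weights would be learned, so that the phonetic or glyphic signal is passed through mainly where it helps distinguish confusable characters and suppressed elsewhere.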
