Chinese spelling correction algorithm based on multi-modal information fusion

Qing ZHANG, Fan YANG, Yuhan FANG

Journal of Computer Applications 2025, 45 (5): 1528-1534. DOI: 10.11772/j.issn.1001-9081.2024050628

The goal of Chinese Spelling Correction (CSC) is to detect and correct character-level or word-level errors in user-input Chinese text, which commonly arise from semantic, phonetic, or glyphic similarities among Chinese characters. However, existing models often neglect local information, fail to fully capture phonetic and glyphic similarities among Chinese characters, and do not effectively integrate these similarities with semantic information. To address these issues, a new CSC algorithm based on multimodal information fusion, PWSpell, is proposed. The algorithm uses a convolutional attention mechanism to focus on local semantic information, employs Pinyin encoding to capture phonetic similarities among characters, and, for the first time, introduces Wubi encoding into the CSC domain to capture glyphic similarities among Chinese characters. It then selectively integrates these two types of similarity information with semantic information processed by BERT (Bidirectional Encoder Representations from Transformers). Experimental results on the SIGHAN 2015 test set show that PWSpell improves error detection accuracy, precision, and F1-score, as well as correction precision and F1-score, with an increase of at least one percentage point in correction precision. Ablation experiments also confirm that each module in PWSpell contributes to its performance.
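
The abstract says the phonetic and glyphic information is "selectively integrated" with the BERT semantic representation but does not give the fusion formula. The sketch below shows one common way such selective fusion might be done: a scalar sigmoid gate per auxiliary modality. The function name `gated_fusion`, the gate weights `w_phon` and `w_glyph`, and the additive form are all assumptions made for illustration; they are not taken from the paper and may differ from what PWSpell actually does.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gated_fusion(semantic, phonetic, glyphic, w_phon, w_glyph):
    """Fuse one character's three feature vectors (hypothetical scheme).

    semantic, phonetic, glyphic: equal-length lists of floats standing in
    for the BERT, Pinyin, and Wubi feature vectors of a single character.
    w_phon, w_glyph: gate weight vectors of length 2 * len(semantic);
    in a real model these would be learned, here they are plain inputs.
    """
    # Each gate sees the semantic vector concatenated with one auxiliary
    # vector and yields a weight in (0, 1) for that auxiliary modality.
    gate_p = sigmoid(dot(w_phon, semantic + phonetic))
    gate_g = sigmoid(dot(w_glyph, semantic + glyphic))
    # The semantic vector passes through unchanged; the phonetic and
    # glyphic vectors are added in proportion to their gates.
    fused = [s + gate_p * p + gate_g * g
             for s, p, g in zip(semantic, phonetic, glyphic)]
    return fused, gate_p, gate_g


if __name__ == "__main__":
    sem = [0.2, -0.1, 0.4]
    pho = [1.0, 0.0, -1.0]
    gly = [0.5, 0.5, 0.5]
    zero = [0.0] * 6  # all-zero gate weights give gates of exactly 0.5
    fused, gate_p, gate_g = gated_fusion(sem, pho, gly, zero, zero)
    print(fused, gate_p, gate_g)
```

With strongly negative gate weights a gate approaches 0 and the fused vector falls back to the semantic vector alone, which is one way to read "selective" integration: the model can ignore a modality when it is not helpful for a given character.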
