AxBERT: An Interpretable Chinese Spelling Correction Method Driven by Associative Knowledge Network
Fanyu Wang, Hangyu Zhu, Zhenping Xie
TL;DR
AxBERT tackles the interpretability gap in Chinese spelling correction by grounding a BERT-based model in an Associative Knowledge Network (AKN) and coupling semantic alignment with attention regulation. The method quantifies BERT’s internal logic as a transforming matrix $M_T$, maps it to an interpretable statistic space via a translator $M_F$, and aligns attention with $M_S$ using a weight regulator $M_W$. Training optimizes a joint loss $L = \lambda L_C + (1-\lambda)L_A$ with $\lambda=0.8$. On the SIGHAN datasets, AxBERT achieves strong sentence-level precision and F1; interpretability analyses confirm meaningful alignment between attention and the AKN, and a case study demonstrates controllable corrections guided by the regulator. The approach offers a path to generalizable, interpretable spelling correction. It can be extended to other languages by initializing the AKN from language-specific corpora, or adapted to other NLP tasks that require semantic regulation and interpretability. Future work will explore decoder-level non-autoregressive configurations and domain-specific AKN adaptations.
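As a small illustration of how the joint loss weights its two terms, the sketch below evaluates $L = \lambda L_C + (1-\lambda)L_A$ for scalar inputs. Reading $L_C$ as a correction loss and $L_A$ as an alignment loss is an assumption made here for illustration, and the function name `joint_loss` is hypothetical; only the formula and $\lambda=0.8$ come from the text.

```python
def joint_loss(loss_c: float, loss_a: float, lam: float = 0.8) -> float:
    """Weighted sum L = lam * L_C + (1 - lam) * L_A.

    loss_c and loss_a are assumed to be already-computed scalar losses
    (interpreted here as correction and alignment terms, which is an assumption).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * loss_c + (1.0 - lam) * loss_a


# With lam = 0.8 the first term dominates:
# 0.8 * 0.5 + 0.2 * 1.0 is about 0.6.
total = joint_loss(0.5, 1.0)
```

With $\lambda=0.8$, a unit change in $L_C$ moves the total four times as much as a unit change in $L_A$, so the second term acts as a comparatively light regularizer on the first.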
Abstract
Deep learning has shown promising performance on a wide range of machine learning tasks. However, the lack of interpretability of deep learning models severely restricts their use in domains that require feature explanations, such as text correction. We therefore propose AxBERT, a novel interpretable deep learning model for Chinese spelling correction that is aligned with an associative knowledge network (AKN). The AKN is constructed from the co-occurrence relations among Chinese characters and represents an interpretable statistic logic, in contrast to the uninterpretable logic of BERT. A translator matrix between BERT and the AKN is introduced to align and regulate the attention component in BERT. In addition, a weight regulator is designed to adjust the attention distributions in BERT so that they appropriately model sentence semantics. Experimental results on the SIGHAN datasets demonstrate that AxBERT achieves strong performance compared with baselines, especially in precision. Our interpretability analysis, together with qualitative reasoning, illustrates the interpretability of AxBERT.
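To make the idea of a character co-occurrence network concrete, the following is a minimal sketch, not the construction used for the AKN itself. It assumes a fixed co-occurrence window and simple row normalization; the window size, the normalization, the toy corpus, and the names `build_cooccurrence` and `normalize` are all hypothetical choices for illustration.

```python
from collections import defaultdict


def build_cooccurrence(sentences, window=2):
    """Count symmetric co-occurrences of characters within `window` positions.

    The fixed window is an illustrative assumption, not the AKN definition.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        chars = list(sent)
        for i, a in enumerate(chars):
            # Pair character i with the next `window` characters.
            for j in range(i + 1, min(i + 1 + window, len(chars))):
                b = chars[j]
                counts[a][b] += 1
                counts[b][a] += 1
    return counts


def normalize(counts):
    """Row-normalize counts so each character's association weights sum to 1."""
    network = {}
    for a, row in counts.items():
        total = sum(row.values())
        network[a] = {b: c / total for b, c in row.items()}
    return network


# Toy corpus (hypothetical): two short sentences sharing a prefix.
corpus = ["我喜欢学习", "我喜欢音乐"]
counts = build_cooccurrence(corpus, window=2)
akn = normalize(counts)
```

Each entry `akn[a][b]` can be read directly as the share of `a`'s observed co-occurrences that involve `b`, which is the sense in which such a statistic is interpretable where a learned attention weight is not.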
