AxBERT: An Interpretable Chinese Spelling Correction Method Driven by Associative Knowledge Network

Fanyu Wang, Hangyu Zhu, Zhenping Xie

TL;DR

AxBERT tackles the interpretability gap in Chinese spelling correction by grounding a BERT-based model in an Associative Knowledge Network (AKN) and coupling semantic alignment with attention regulation. The method quantifies BERT’s internal logic as a transforming matrix $M_T$, maps it to an interpretable statistical space via a translator matrix $M_F$, and aligns attention with the associative matrix $M_S$ using a weight regulator $M_W$. Training optimizes a joint loss $L = \lambda L_C + (1-\lambda)L_A$ with $\lambda=0.8$. Experimental results on SIGHAN datasets show that AxBERT achieves strong sentence-level precision and F1, and interpretability analyses confirm meaningful alignment between attention and the AKN; a case study demonstrates controllable corrections guided by the regulator. The approach offers a path to generalizable, interpretable spelling correction. It can be extended to other languages by initializing the AKN from language-specific corpora, or adapted to other NLP tasks that require semantic regulation and interpretability. Future work explores decoder-level non-autoregressive configurations and domain-specific AKN adaptations.
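As a small illustration of the joint objective quoted above, the sketch below evaluates $L = \lambda L_C + (1-\lambda)L_A$ for scalar loss values. Reading $L_C$ as the correction loss and $L_A$ as the alignment loss is an assumption based on the symbols; the function name `joint_loss` and the example values are hypothetical and not taken from the paper.

```python
def joint_loss(loss_correction: float, loss_alignment: float, lam: float = 0.8) -> float:
    """Weighted sum L = lam * L_C + (1 - lam) * L_A.

    loss_correction (L_C) and loss_alignment (L_A) are assumed to be
    already-computed scalar losses; lam defaults to the reported 0.8.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * loss_correction + (1.0 - lam) * loss_alignment


# Hypothetical loss values, chosen only to show the weighting.
example = joint_loss(loss_correction=2.0, loss_alignment=1.0)
print(round(example, 6))
```

With $\lambda=0.8$ the first term dominates: in this toy call it contributes 1.6 of the total, against roughly 0.2 from the second term.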

Abstract

Deep learning has shown promising performance on various machine learning tasks. Nevertheless, the uninterpretability of deep learning models severely restricts their use in domains that require feature explanations, such as text correction. Therefore, a novel interpretable deep learning model (named AxBERT) is proposed for Chinese spelling correction by aligning with an associative knowledge network (AKN). The AKN is constructed from the co-occurrence relations among Chinese characters and represents interpretable statistical logic, in contrast to the uninterpretable logic of BERT. A translator matrix between BERT and the AKN is introduced for the alignment and regulation of the attention component in BERT. In addition, a weight regulator is designed to adjust the attention distributions in BERT so that they appropriately model the sentence semantics. Experimental results on SIGHAN datasets demonstrate that AxBERT achieves strong performance, particularly in precision, compared with baselines. Our interpretability analysis, together with qualitative reasoning, illustrates the interpretability of AxBERT.
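To make the idea of a character co-occurrence network concrete, here is a minimal sketch that counts how often two characters appear within a small window of each other and then normalizes each row. It is not the paper's AKN construction, whose details are not given in this summary; the window size, the symmetric counting, the row normalization, and the names `build_cooccurrence` and `normalize_rows` are all illustrative assumptions.

```python
from collections import defaultdict


def build_cooccurrence(sentences, window=2):
    """Count symmetric character co-occurrences within a sliding window.

    Assumption: two characters co-occur if they are at most `window`
    positions apart in the same sentence.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        chars = list(sentence)
        for i, a in enumerate(chars):
            for j in range(i + 1, min(i + window + 1, len(chars))):
                b = chars[j]
                if a == b:
                    continue  # skip self-pairs
                counts[a][b] += 1
                counts[b][a] += 1
    return counts


def normalize_rows(counts):
    """Turn raw counts into per-character association weights summing to 1."""
    normalized = {}
    for a, row in counts.items():
        total = sum(row.values())
        normalized[a] = {b: c / total for b, c in row.items()}
    return normalized


# Toy corpus, invented for illustration only.
corpus = ["我喜欢猫", "我喜欢狗"]
counts = build_cooccurrence(corpus, window=2)
association = normalize_rows(counts)
```

Each row of `association` is a plain frequency statistic that can be read directly, which is the sense in which a co-occurrence network is interpretable where learned attention weights are not.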

Paper Structure

This paper contains 21 sections, 16 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Example of the character relation
  • Figure 2: Information flow in AxBERT
  • Figure 3: Overview of AxBERT's architecture. (Left) In the alignment stage, the input sentence is first modeled by BERT, and the transformation of the sentence representations in BERT is quantified as the transforming matrix $\bm{M}_{T}$. (Middle) $\bm{M}_{T}$ is aligned with the associative matrix $\bm{M}_{S}$ with the help of the translator matrix $\bm{M}_F$; $\bm{M}_F$ is also used to align the attention matrix $\bm{M}_A$ with $\bm{M}_{S}$. (Right) After this series of alignment and regulation steps, the character-level similarity between attention and $\bm{M}_{S}$ is computed within the weight regulator in the correction stage. The BERT parameters are shared in real time between the alignment and correction stages.
  • Figure 4: Similarity of association and attention
  • Figure 5: Retaining ratio result
  • ...and 1 more figures
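
The character-level similarity between attention and the associative matrix mentioned in the Figure 3 caption can be pictured with a minimal sketch. It compares each character's attention row with the matching row of an association matrix using cosine similarity. The choice of cosine similarity, the function names, and the toy matrices are assumptions for illustration; this summary does not specify the measure used inside the weight regulator.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rowwise_similarity(attention, association):
    """One similarity score per character: row i of attention vs. row i of association."""
    if len(attention) != len(association):
        raise ValueError("matrices must have the same number of rows")
    return [cosine(a_row, s_row) for a_row, s_row in zip(attention, association)]


# Toy 2x2 matrices, invented for illustration only.
attention = [[1.0, 0.0], [0.5, 0.5]]
association = [[1.0, 0.0], [0.0, 1.0]]
scores = rowwise_similarity(attention, association)
```

Under this reading, a low score for a character would indicate that the model's attention departs from the statistical association pattern at that position.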