PEACH: Pretrained-embedding Explanation Across Contextual and Hierarchical Structure
Feiqi Cao, Caren Han, Hyunsuk Chung
TL;DR
PEACH addresses the interpretability gap in pretrained language models by translating contextual embeddings into a globally interpretable, tree-based explanation framework. It introduces a flexible pipeline that fine-tunes PLMs, reduces and clusters embedding features, and builds decision trees (or forests) whose leaves encode class distributions and whose nodes reveal representative text via word-cloud prototypes. The approach integrates interpretability-focused visuals (prototype nodes, PoS/NER filters, word matching) to enable both global and local explanations and dataset debugging. Empirical results across nine NLP benchmarks show that PEACH maintains competitive accuracy while enhancing transparency, supported by a human-evaluation study that favors PEACH over LIME and Anchor. A medical-domain case study underscores the practical value of domain-specific pretraining for interpretable explanations.
Abstract
In this work, we propose PEACH (Pretrained-embedding Explanation Across Contextual and Hierarchical Structure), a novel tree-based explanation technique that explains, in a human-interpretable manner, how text documents are classified using pretrained contextual embeddings. PEACH can adopt the contextual embeddings of any pretrained language model (PLM) as the training input for the decision tree. Using PEACH, we perform a comprehensive analysis of several contextual embeddings on nine NLP text classification benchmarks. This analysis demonstrates the flexibility of the model across different PLM contextual embeddings and across its attribute selection, scaling, and clustering methods. Furthermore, we show the utility of the explanations by visualising the feature selection and the important trends in text classification via human-interpretable word-cloud-based trees, which clearly identify model mistakes and assist in dataset debugging. Besides interpretability, PEACH achieves classification performance that is comparable to or better than that of the pretrained models.
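The pipeline summarized above (embeddings, then feature reduction and clustering, then a decision tree whose leaves hold class distributions) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn, uses synthetic vectors as a stand-in for the embeddings of a fine-tuned PLM, and uses `FeatureAgglomeration` as one possible way to cluster embedding dimensions; the sizes (64 dimensions, 16 clusters, depth 4) are arbitrary illustrative choices.

```python
from collections import defaultdict

from sklearn.cluster import FeatureAgglomeration
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic vectors stand in for document embeddings taken
# from a fine-tuned PLM, so the sketch runs without downloading a model.
X, y = make_classification(
    n_samples=600, n_features=64, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale the embedding dimensions, then merge similar dimensions into a
# smaller set of clustered features (ASSUMPTION: FeatureAgglomeration is
# only one possible reduction/clustering choice).
scaler = StandardScaler().fit(X_train)
reducer = FeatureAgglomeration(n_clusters=16).fit(scaler.transform(X_train))
Z_train = reducer.transform(scaler.transform(X_train))
Z_test = reducer.transform(scaler.transform(X_test))

# A shallow tree keeps the number of decision nodes small enough to inspect.
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(Z_train, y_train)

# Each leaf carries a class distribution; predict_proba exposes it per document.
leaf_class_dist = tree.predict_proba(Z_test)
accuracy = tree.score(Z_test, y_test)

# Group test documents by the leaf they reach. In a real setting, the texts
# in each group would feed the word-cloud visualisation for that node.
leaf_ids = tree.apply(Z_test)
docs_per_leaf = defaultdict(list)
for doc_index, leaf in enumerate(leaf_ids):
    docs_per_leaf[int(leaf)].append(doc_index)

print(f"reduced features: {Z_train.shape[1]}, leaves reached: {len(docs_per_leaf)}")
print(f"held-out accuracy: {accuracy:.3f}")
```

With real data, `X` would be replaced by embeddings extracted from the chosen PLM, and the document indices collected in `docs_per_leaf` would be mapped back to their source texts to build each node's word cloud.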
