PE: A Poincare Explanation Method for Fast Text Hierarchy Generation

Qian Chen, Dongyang Li, Xiaofeng He, Hongzhao Li, Hongyu Yi

TL;DR

PE addresses the opacity of deep NLP models by modeling feature interactions through hyperbolic geometry on the Poincaré ball with curvature $c=1$. It projects word embeddings into two hyperbolic spaces to capture semantic and syntactic structure, assigns feature contributions via a direct Shapley-style score, and decodes a text hierarchy as a minimum spanning tree in the projected space. The approach yields fast, non-contiguous interaction-aware HA with strong empirical support across three datasets, outperforming baselines in AOPC and construction time. This creates a scalable, linguistically informed explanation framework for large NLP models.

Abstract

The black-box nature of deep learning models in NLP hinders their widespread application. The research focus has shifted to Hierarchical Attribution (HA) for its ability to model feature interactions. Recent works model non-contiguous combinations with a time-costly greedy search in Euclidean spaces, neglecting the underlying linguistic information in feature representations. In this work, we introduce a novel method, Poincare Explanation (PE), for modeling feature interactions in hyperbolic spaces in a time-efficient manner. Specifically, we treat building text hierarchies as finding spanning trees in hyperbolic spaces. First, we project the embeddings into hyperbolic spaces to elicit their inherent semantic and syntactic hierarchical structures. Then, we propose a simple yet effective strategy to calculate the Shapley score. Finally, we build the hierarchy, proving that the construction process in the projected space can be viewed as building a minimum spanning tree, and introduce a time-efficient building algorithm. Experimental results demonstrate the effectiveness of our approach.
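
To make the spanning-tree view concrete, here is a minimal sketch, not the authors' algorithm. It assumes the points already lie inside the unit Poincaré ball (curvature $c=1$), uses the standard closed-form geodesic distance, and builds a minimum spanning tree with plain Prim's algorithm. The function names `poincare_distance` and `minimum_spanning_tree` are hypothetical, and the projection and Shapley scoring steps of PE are not reproduced.

```python
import math


def poincare_distance(x, y):
    """Geodesic distance on the unit Poincare ball (curvature c = 1).

    Standard formula: arcosh(1 + 2*|x - y|^2 / ((1 - |x|^2) * (1 - |y|^2))).
    Assumes both points have Euclidean norm strictly below 1.
    """
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    sq_norm_x = sum(a * a for a in x)
    sq_norm_y = sum(b * b for b in y)
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_x) * (1.0 - sq_norm_y)))


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph weighted by Poincare distance.

    Returns a list of (parent_index, child_index, weight) edges.
    This is a generic O(n^2) construction used for illustration only.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge weight into each node
    parent = [-1] * n       # tree node providing that cheapest edge
    best[0] = 0.0           # start the tree from node 0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree with the cheapest connecting edge.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        # Relax edges from the newly added node to all remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                d = poincare_distance(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges
```

In this sketch, points near the boundary of the ball are far from everything else, so they tend to attach as leaves, which is the intuition behind reading a hierarchy off the tree.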

Paper Structure (24 sections, 25 equations, 11 figures, 4 tables, 1 algorithm)

This paper contains 24 sections, 25 equations, 11 figures, 4 tables, 1 algorithm.

Figures (11)

  • Figure 1: Pearson correlation $\rho$ results from ig_v1 with BERT and LSTM on SST-2 and Yelp datasets. A higher correlation coefficient indicates a stronger ability of the method to identify important words.
  • Figure 2: Left: The projection illustration for the positive example "It was an interesting but somewhat draggy movie." The center point represents the prototype for the positive label. Right: The negative example "It was a draggy but somewhat interesting movie." The center point represents the prototype for the negative label.
  • Figure 3: Three different binary tree types rooted from $j\vee j^\prime\vee u$.
  • Figure 4: Evaluation results of Ablation Study.
  • Figure 5: PE and $\text{HE}_{LOO}$ for BERT on two examples from the Rotten Tomatoes dataset. The subtree in the upper right corner is generated by PE and the lower one is produced by $\text{HE}_{LOO}$.
  • ...and 6 more figures

Theorems & Definitions (1)

  • proof