Breaking Free from MMI: A New Frontier in Rationalization by Probing Input Utilization

Wei Liu, Zhiying Deng, Zhongyu Niu, Jun Wang, Haozhao Wang, Zhigang Zeng, Ruixuan Li

TL;DR

This work challenges the entrenched use of the maximum mutual information (MMI) criterion for rationalization by exposing its diminishing marginal returns when rationales are identified iteratively. It introduces N2R, a norm-based objective that exploits the low-rank, capability-subspace structure of neural networks to gauge which input components the model can actually utilize, using the intermediate representation norm $||Enc(Z)||_2$ as the signal. Across text and graph tasks with multiple encoders, N2R outperforms vanilla MMI and several MMI-enhanced baselines, and even matches or surpasses a representative large language model on certain datasets. The approach offers a simple, scalable alternative that can be integrated with MMI and provides a bridge between out-of-distribution detection ideas and explainability, with potential applications to pretrained encoders and broader XAI contexts.

Abstract

Extracting a small subset of crucial rationales from the full input is a key problem in explainability research. The most widely used fundamental criterion for rationale extraction is the maximum mutual information (MMI) criterion. In this paper, we first demonstrate that MMI suffers from diminishing marginal returns. Once part of the rationale has been identified, finding the remaining portions contributes only marginally to increasing the mutual information, making it difficult to use MMI to locate the rest. In contrast to MMI, which aims to reproduce the prediction, we seek to identify the parts of the input that the network can actually utilize. This is achieved by comparing how well different rationale candidates match the capability space of the weight matrix. The weight matrix of a neural network is typically low-rank, meaning that the linear combinations of its column vectors can only cover part of the directions in a high-dimensional space (here, the dimensionality of an input vector). If an input is fully utilized by the network, it generally matches these directions (e.g., a portion of a hypersphere), resulting in a representation with a high norm. Conversely, if an input primarily falls outside (orthogonal to) these directions, its representation norm will approach zero, and it behaves like noise that the network cannot effectively utilize. Building on this, we propose using the norms of rationale candidates as an alternative objective to MMI. Through experiments on four text classification datasets and one graph classification dataset using three network architectures (GRUs, BERT, and GCN), we show that our method outperforms MMI and its improved variants in identifying better rationales. We also compare our method with a representative LLM (llama-3.1-8b-instruct) and find that our simple method achieves comparable results and can sometimes even outperform it.
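The low-rank argument in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: the single linear layer `x @ W`, the dimensions, and the helper names `make_low_rank_weight` and `split_by_column_space` are all assumptions chosen for illustration. It only shows that an input lying in the span of the weight matrix's columns yields a large representation norm, while an input orthogonal to that span yields a norm near zero.

```python
import numpy as np


def make_low_rank_weight(p, d, r, rng):
    """Build a (p x d) weight matrix of rank r (hypothetical toy layer)."""
    A = rng.standard_normal((p, r))
    B = rng.standard_normal((r, d))
    return A @ B


def split_by_column_space(W, v, r):
    """Split v into unit vectors inside / orthogonal to the column space of W."""
    # The first r left-singular vectors form an orthonormal basis
    # of the column space of a rank-r matrix.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    Q = U[:, :r]
    inside = Q @ (Q.T @ v)   # projection onto the column space
    outside = v - inside     # component orthogonal to it
    return inside / np.linalg.norm(inside), outside / np.linalg.norm(outside)


rng = np.random.default_rng(0)
p, d, r = 64, 32, 4          # input dim, output dim, rank (all assumed)
W = make_low_rank_weight(p, d, r, rng)
v = rng.standard_normal(p)

x_in, x_out = split_by_column_space(W, v, r)

# With the row-vector convention y = x @ W, each output coordinate is the
# inner product of x with one column of W.
norm_in = np.linalg.norm(x_in @ W)    # input the layer can "utilize"
norm_out = np.linalg.norm(x_out @ W)  # input orthogonal to the columns

print(f"norm of representation, aligned input:    {norm_in:.4f}")
print(f"norm of representation, orthogonal input: {norm_out:.2e}")
```

Both inputs have unit length, so the difference in output norm comes only from how well each input matches the directions the weight matrix covers. A real encoder is nonlinear and multi-layer, so this toy layer should be read as intuition rather than as a model of $Enc(Z)$.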

Paper Structure

This paper contains 29 sections, 1 theorem, 27 equations, 10 figures, 9 tables.

Key Result

Lemma 1

Let $U$ and $V$ be two random points on the unit hypersphere in $\mathbb{R}^p$, and let $O$ be the origin. Let $\Theta$ be the angle between the vectors $\overrightarrow{OU}$ and $\overrightarrow{OV}$. Then $P\left(\left|\Theta - \frac{\pi}{2}\right| \geq \epsilon\right) \leq K\sqrt{p}\,(\cos\epsilon)^{p-2}$ for all $p\geq 2$ and $\epsilon \in (0,\pi/2)$, where $K$ is a universal constant.
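The lemma concerns the angle between two random points on a unit hypersphere. A quick Monte Carlo sketch can show how that angle behaves as the dimension $p$ grows. The sample size, the tolerance `eps = 0.2`, the chosen dimensions, and the function names below are assumptions made for illustration, not values from the paper.

```python
import numpy as np


def random_unit_vectors(n, p, rng):
    """Draw n points uniformly on the unit hypersphere in R^p."""
    # Normalizing standard Gaussian vectors gives a uniform direction.
    x = rng.standard_normal((n, p))
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def angle_deviation_rate(p, eps, n=5000, seed=0):
    """Empirical frequency of |Theta - pi/2| >= eps for random pairs."""
    rng = np.random.default_rng(seed)
    u = random_unit_vectors(n, p, rng)
    v = random_unit_vectors(n, p, rng)
    cos_theta = np.clip(np.sum(u * v, axis=1), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    return float(np.mean(np.abs(theta - np.pi / 2) >= eps))


# Hypothetical dimensions, chosen only to show the trend.
rates = {p: angle_deviation_rate(p, eps=0.2) for p in (2, 10, 100, 500)}
for p, rate in rates.items():
    print(f"p = {p:4d}: fraction with |Theta - pi/2| >= 0.2 is {rate:.4f}")
```

In this sketch the fraction of pairs that deviate from a right angle shrinks quickly with $p$, which matches the intuition that a random high-dimensional input is almost orthogonal to any fixed direction.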

Figures (10)

  • Figure 1: The standard rationalization framework RNP. The task is binary sentiment classification about the hotel's location. $X,Z,\hat{Y},Y$ represent the input, the extracted rationale candidate, the prediction, and the ground truth label, respectively. $M$ is a sequence of binary masks. $Enc(Z)$ is the encoder's final layer representation (similar to the term "embedding" in emb1, emb2). $\theta_E,\theta_P$ represent the parameters of the extractor and the predictor. $H_c$ denotes cross-entropy.
  • Figure 2: The diminishing marginal returns of the sigmoid function.
  • Figure 3: How (a) the prediction accuracy, (b) the cross-entropy loss, and (c) the norm of the representation (i.e., $||Enc(Z)||_2$) vary with the proportion of true rationale components in the rationale candidate fed to a trained standard RNP predictor. The dataset is Beer-Aroma. Results on more datasets are shown in the appendix.
  • Figure 4: The comparison between vanilla MMI, N2R, and N2R+MMI on the datasets from (a) BeerAdvocate benchmark and (b) HotelReviews benchmark.
  • Figure 5: An example of llama's output. Here "1" means that the class label $Y$ is positive, and the words after "$|$" represent the rationale.
  • ...and 5 more figures
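The diminishing marginal returns named in the Figure 2 caption can be reproduced with a few lines of arithmetic. The sketch below assumes, purely for illustration, that each additional piece of rationale raises a logit by one unit; the step size and range are not taken from the paper.

```python
import numpy as np


def sigmoid(t):
    """Standard logistic function."""
    return 1.0 / (1.0 + np.exp(-t))


# Hypothetical logits: each step stands for one more piece of evidence.
logits = np.arange(0.0, 6.0, 1.0)   # 0, 1, 2, 3, 4, 5
probs = sigmoid(logits)

# Marginal gain in predicted probability from each extra unit of logit.
gains = np.diff(probs)

for t, gain in zip(logits[1:], gains):
    print(f"logit {t - 1:.0f} -> {t:.0f}: probability gain {gain:.4f}")
```

Each equal step in the logit buys a smaller increase in probability than the step before it. Under the stated assumption, this is the sense in which a criterion driven by prediction confidence gives less and less signal for locating the remaining parts of a rationale.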

Theorems & Definitions (1)

  • Lemma 1: highdimvec