Interpretive Efficiency: Information-Geometric Foundations of Data Usefulness

Ronald Katende

TL;DR

This work introduces Interpretive Efficiency, $E(\varphi;N)$, an axiomatic measure of how well task-relevant information passes through an interpretive channel. By linking $E$ to mutual information and Fisher information, the author builds a framework on five guiding axioms, provides finite-sample estimation guarantees, and connects the measure to the variational information bottleneck via V-GIB compatibility. Theoretical results (including a local Fisher-geometric expansion) are complemented by controlled synthetic examples and empirical validation on digits and spectral signals, illustrating when compression preserves or degrades interpretive usefulness beyond raw accuracy. Practically, $E$ serves as a diagnostic tool for representation design, highlighting redundancy, robustness, and the geometry of information flow.

Abstract

Interpretability is central to trustworthy machine learning, yet existing metrics rarely quantify how effectively data support an interpretive representation. We propose Interpretive Efficiency, a normalized, task-aware functional that measures the fraction of task-relevant information transmitted through an interpretive channel. The definition is grounded in five axioms ensuring boundedness, Blackwell-style monotonicity, data-processing stability, admissible invariance, and asymptotic consistency. We relate the functional to mutual information and derive a local Fisher-geometric expansion, then establish asymptotic and finite-sample estimation guarantees using standard empirical-process tools. Experiments on controlled image and signal tasks demonstrate that the measure recovers theoretical orderings, exposes representational redundancy masked by accuracy, and correlates with robustness, making it a practical, theory-backed diagnostic for representation design.

Paper Structure

This paper contains 59 sections, 16 theorems, 84 equations, 5 figures, 2 tables, 1 algorithm.

Key Result

Proposition 1

Under the normalization in Definition 1, using either the ratio or the calibrated-difference form with finite positive reference terms, the efficiency satisfies $0\le E(\varphi;N)\le 1$ for all admissible $\varphi$ and all $N$.
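The bound in Proposition 1 can be illustrated with a minimal sketch. The page does not reproduce Definition 1, so the code below assumes a simple ratio form, $I(\varphi(X);Y)/I(X;Y)$, for a discrete task with a deterministic map $\varphi$. The function names `mutual_information` and `ratio_efficiency`, the toy joint distribution, and the choice of $I(X;Y)$ as the reference term are all illustrative assumptions, not the paper's estimator.

```python
import numpy as np


def mutual_information(joint):
    """Mutual information (in nats) of a discrete joint distribution p(a, b)."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()               # normalize defensively
    pa = joint.sum(axis=1, keepdims=True)     # marginal over rows, shape (n_a, 1)
    pb = joint.sum(axis=0, keepdims=True)     # marginal over columns, shape (1, n_b)
    mask = joint > 0                          # skip zero cells (0 * log 0 := 0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / (pa @ pb)[mask])))


def ratio_efficiency(joint_xy, phi):
    """Assumed ratio form I(phi(X); Y) / I(X; Y); phi[x] is the code assigned to x."""
    joint_xy = np.asarray(joint_xy, dtype=float)
    phi = np.asarray(phi)
    n_z = int(phi.max()) + 1
    joint_zy = np.zeros((n_z, joint_xy.shape[1]))
    np.add.at(joint_zy, phi, joint_xy)        # pool rows of X that share a code
    reference = mutual_information(joint_xy)
    if reference <= 0:
        raise ValueError("reference term must be finite and positive")
    return mutual_information(joint_zy) / reference


# Hypothetical toy task: X takes 4 values, Y takes 2 values.
joint_xy = np.array([
    [0.20, 0.05],
    [0.20, 0.05],
    [0.05, 0.20],
    [0.05, 0.20],
])

maps = {
    "identity": [0, 1, 2, 3],
    "merge equivalent inputs": [0, 0, 1, 1],
    "lossy merge": [0, 0, 0, 1],
    "constant": [0, 0, 0, 0],
}
for name, phi in maps.items():
    print(f"{name}: E = {ratio_efficiency(joint_xy, phi):.4f}")
```

Under this assumed form, the upper bound follows from the data-processing inequality, since $\varphi(X)$ cannot carry more information about $Y$ than $X$ does. Pooling inputs that share the same conditional distribution over $Y$ leaves the ratio at 1 up to rounding, while the constant map drives it to 0.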

Figures (5)

  • Figure 1: Digits. Efficiency and accuracy follow the expected ordering identity $>$ PCA-16 $>$ random projection. Accuracy remains high even when $E$ declines, revealing redundancy.
  • Figure 2: Sinusoids. FFT preserves the discriminative spectral structure and attains high $E$. Downsampling discards key frequencies and reduces $E$ and accuracy. Random projection is intermediate.
  • Figure 3: Clean CV accuracy vs. representation dimension.

Theorems & Definitions (47)

  • Definition 1: Interpretive Efficiency
  • Definition 2: Axioms for $E$
  • Proposition 1: Boundedness
  • Proof (sketch)
  • Proposition 2: Continuity and semicontinuity
  • Proof (sketch)
  • Proposition 3: Monotonicity and data-processing analogue
  • Proof (sketch)
  • Proposition 4: Transformation invariances
  • Proof (sketch)
  • ...and 37 more