Minimal Sufficient Representations for Self-interpretable Deep Neural Networks

Zhiyao Tan, Liu Li, Huazhen Lin

Abstract

Deep neural networks (DNNs) achieve remarkable predictive performance but remain difficult to interpret, largely due to overparameterization that obscures the minimal structure required for interpretation. Here we introduce DeepIn, a self-interpretable neural network framework that adaptively identifies and learns the minimal representation necessary for preserving the full expressive capacity of standard DNNs. We show that DeepIn can correctly identify the minimal representation dimension, select relevant variables, and recover the minimal sufficient network architecture for prediction. The resulting estimator achieves optimal non-asymptotic error rates that adapt to the learned minimal dimension, demonstrating that recovering minimal sufficient structure fundamentally improves generalization error. Building on these guarantees, we further develop hypothesis testing procedures for both selected variables and learned representations, bridging deep representation learning with formal statistical inference. Across biomedical and vision benchmarks, DeepIn improves both predictive accuracy and interpretability, reducing error by up to 30% on real-world datasets while automatically uncovering human-interpretable discriminative patterns. Our results suggest that interpretability and statistical rigor can be embedded directly into deep architectures without sacrificing performance.

Paper Structure (15 sections, 6 theorems, 18 equations, 5 figures, 4 tables, 2 algorithms)

Key Result

Theorem 1

For some $0\leq\nu_2<\nu_1\leq 1/2$, define $\Delta_{net,n}:=Cn^{-1/2}+D\log n\cdot \eta_{net}^2 \mathcal{T}_0^{1-2\nu_2}$, where $C$ and $D$ are positive constants. Under Supplementary Assumptions 1-5, if for some constants $C_1, C_2$ and $C$, where $\lambda_m:=\lambda_1\vee\lambda_2\vee\lambda_3$ with $\lambda_j=R_j(t)$, $\boldsymbol{\rho}_m(\hat{\boldsymbol{\mu}}_{\mathcal{A}^c}):=(\sum\limits

Figures (5)

  • Figure 1: Test power for informative variables and empirical size for noisy variables in Settings 1–4 under correlation level $\rho=0.2$.
  • Figure 2: The distribution under $H_0$ and the power of different representations, with their singular values, in Setting 3 ($\rho=0.2$): (a) the distribution under $H_0$; (b) the power of the representations; (c) the singular values of the representations.
  • Figure 3: First column: the digits 5 and 6. Second column: important pixels for predicting digit 5. Last column: important pixels for predicting digit 6.
  • Figure 4: First column: the T-shirt and sneaker images. Second column: important pixels for predicting T-shirt. Last column: important pixels for predicting sneaker.
  • Figure : Parameter truncation algorithm

Theorems & Definitions (8)

  • Definition 1
  • Definition 2: Selection consistency
  • Theorem 1: Selection consistency
  • Theorem 2: Non-asymptotic error bound
  • Corollary 1: Optimal non-asymptotic error bound
  • Theorem 3
  • Theorem 4: Functional normality
  • Theorem 5: Asymptotic normality of test statistic