HOLE: Homological Observation of Latent Embeddings for Neural Network Interpretability

Sudhanva Manjunath Athreya, Paul Rosen

TL;DR

HOLE presents a global, topology-based framework for neural network interpretability by applying persistent homology to layer activations and visualizing the evolution of activation space structure. It deploys multiple distance metrics and three visualization modalities (Sankey diagrams, heatmap dendrograms, blob graphs) to reveal class separation, feature disentanglement, and robustness across layers and architectures. Through CIFAR-10 experiments on ResNet and ViT models, HOLE demonstrates how topological signatures shift with noise and compression, providing insights beyond accuracy metrics. The work contributes a model-agnostic workflow, quantitative topological tasks (Hierarchy, Separability, Homogeneity, Outliers), and practical tools to diagnose and guide robust, efficient neural network design.

Abstract

Deep learning models have achieved remarkable success across various domains, yet their learned representations and decision-making processes remain largely opaque and hard to interpret. This work introduces HOLE (Homological Observation of Latent Embeddings), a method for analyzing and interpreting deep neural networks through persistent homology. HOLE extracts topological features from neural activations and presents them using a suite of visualization techniques, including Sankey diagrams, heatmaps, dendrograms, and blob graphs. These tools facilitate the examination of representation structure and quality across layers. We evaluate HOLE on standard datasets using a range of discriminative models, focusing on representation quality, interpretability across layers, and robustness to input perturbations and model compression. The results indicate that topological analysis reveals patterns associated with class separation, feature disentanglement, and model robustness, providing a complementary perspective for understanding and improving deep learning systems.
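To make "extracting topological features from activations" concrete, here is a minimal sketch of 0-dimensional persistent homology, which tracks when connected components of a point cloud merge as a distance threshold grows. It is an illustration under stated assumptions, not the authors' implementation: the function name `h0_persistence`, the Euclidean metric, and the toy `activations` are all hypothetical stand-ins. The sketch relies on the standard fact that the merge scales of a Vietoris-Rips filtration in dimension 0 are the edge weights of a minimum spanning tree.

```python
import math
from itertools import combinations

def h0_persistence(points):
    """Return the death scales of 0-dimensional features (connected components).

    Every point is born at scale 0. A component dies when the growing
    distance threshold first merges it into another component. These merge
    scales are the edge weights chosen by Kruskal's minimum spanning tree
    algorithm. Euclidean distance is an assumption made for this sketch.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by length (the filtration order).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:           # this edge merges two components
            parent[ri] = rj
            deaths.append(d)   # one component dies at scale d
    return deaths              # n - 1 finite deaths; one component never dies

# Hypothetical stand-in for one layer's activations: two well-separated groups.
activations = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 0.0), (10.0, 1.0)]
deaths = h0_persistence(activations)
```

In this toy input, three components die at a small scale and one dies only at a much larger scale. That single long-lived component is the kind of signal that a well-separated cluster produces in a persistence diagram or barcode.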

Paper Structure

This paper contains 41 sections, 4 equations, 9 figures, 1 table.

Figures (9)

  • Figure 1: Example (left) persistence diagram and (right) barcode.
  • Figure 2: Overview of HOLE. During inference, neural network activations are extracted via forward hooks and passed as point clouds to the persistent homology function; the resulting filtration process is then visualized.
  • Figure 3: (o) The input dataset was used to generate (a-d,i-k) distance heatmaps and (e-h,l-n) MDS visualizations showing pairwise distances between the data points using various metrics.
  • Figure 4: Examples of the visualizations used to support tasks [T1]-[T4] using persistent homology.
  • Figure 5: Comparison of (a-d) Sankey diagrams for ViT encoder layers 10 and 11 using different distance metrics and (e-f) heatmap dendrograms between layers 0 and 11.
  • ...and 4 more figures
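
Several of the captions above compare results across distance metrics. The sketch below shows why the choice matters: the same points can be close under one metric and far apart under another. The two metrics used here (Euclidean and cosine distance) and the helper names are assumptions chosen for illustration; the listing above does not say which metrics the paper uses.

```python
import math

def euclidean(u, v):
    """Straight-line distance between two vectors."""
    return math.dist(u, v)

def cosine_distance(u, v):
    """One minus cosine similarity; ignores vector magnitude.

    Assumes neither vector is all zeros.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    return 1.0 - dot / norm

def distance_matrix(points, metric):
    """Full pairwise distance matrix as a list of rows."""
    return [[metric(p, q) for q in points] for p in points]

# Hypothetical activation vectors: the first two share a direction but
# differ in magnitude; the third is orthogonal to both.
points = [(1.0, 0.0), (2.0, 0.0), (0.0, 1.0)]

d_euc = distance_matrix(points, euclidean)
d_cos = distance_matrix(points, cosine_distance)
```

Here the first two points are separated under the Euclidean metric but coincide under cosine distance. Any filtration built on these matrices would therefore merge them at different scales.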