Characterization of topological structures in different neural network architectures
Paweł Świder
TL;DR
This work studies neural network internals by applying Topological Data Analysis (TDA) to layer activations, using persistent homology ($H_k$) and Betti numbers ($\beta_k$) computed on Vietoris–Rips complexes to quantify topology across ResNet, VGG19, and ViT. The authors propose a practical workflow, investigate how sample size and outliers affect persistence diagrams (noting a threshold of around a few hundred points and a generally modest impact of Local Outlier Factor (LOF) filtering), and compare topological features across architectures and layers, including the effects of finetuning. Key findings include architecture-dependent topology, with deeper layers often undergoing stronger topological transformations; shared topological tendencies among similarly structured models; and distinctive, outlier-rich ViT representations in which pre-trained and finetuned models diverge in the middle and late layers. By demonstrating consistent train/test topology and outlining guidelines for fair diagram comparisons, the study validates TDA as a powerful tool for interpreting neural representations and guiding future topology-based analyses across diverse architectures.
Abstract
As neural networks become more powerful and more widely deployed, understanding what happens inside them becomes one of the most crucial tasks ahead. This work uses TDA methods to analyze neural representations. We develop methods for analyzing representations from different architectures and examine how they should be applied to obtain valid results. Our findings indicate that removing outliers has little impact on the results and that representations should be compared only when they contain the same number of elements. We applied these methods to the ResNet, VGG19, and ViT architectures and found substantial differences along with some similarities. Additionally, we found that models with similar architectures tend to have similar representation topology, and that models with more layers change their topology more smoothly. Furthermore, the topology of pre-trained and finetuned models starts to differ in the middle and final layers while remaining quite similar in the initial layers. These findings demonstrate the efficacy of TDA in the analysis of neural network behavior.
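To make the quantities above concrete, here is a minimal Python sketch of the simplest case, $H_0$ on a Vietoris–Rips filtration. It is an illustration only, not the author's implementation. The function names (`h0_death_times`, `betti0`, `subsample`) and the toy point cloud are hypothetical. The sketch assumes an edge enters the filtration at a scale equal to the pairwise Euclidean distance. Higher-dimensional groups $H_k$ with $k \geq 1$ would normally need a dedicated persistent homology library.

```python
import math
import random
from itertools import combinations


def h0_death_times(points):
    """Finite death times of H_0 classes in a Vietoris-Rips filtration.

    Every point is born at scale 0. A connected component dies when an
    edge first merges it into another one, so the finite death times are
    exactly the minimum-spanning-tree edge lengths (Kruskal + union-find).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components
            parent[ri] = rj
            deaths.append(length)
    return deaths  # n - 1 values; one component never dies


def betti0(points, scale):
    """Number of connected components (beta_0) at a given scale."""
    deaths = h0_death_times(points)
    return len(points) - sum(d <= scale for d in deaths)


def subsample(points, size, seed=0):
    """Draw a fixed-size random subset so that two representations are
    compared on the same number of elements (hypothetical helper)."""
    return random.Random(seed).sample(points, size)


# Toy "activations": two well-separated clusters of three points each.
cloud = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(betti0(cloud, 0.5))   # 6: no edges yet, every point is its own component
print(betti0(cloud, 2.0))   # 2: each cluster has merged internally
print(betti0(cloud, 20.0))  # 1: the two clusters are joined
```

In practice, `points` would be a matrix of layer activations, one row per input sample. `subsample` would be applied to each representation before its diagram is computed, in line with the recommendation to compare only representations with the same number of elements.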
