Understanding Sparse Neural Networks from their Topology via Multipartite Graph Representations

Elia Cunegatti; Matteo Farina; Doina Bucur; Giovanni Iacca

TL;DR

This paper investigates how to characterize Sparse Neural Networks (SNNs) by their topology in order to predict their performance after pruning at initialization. The authors critique existing input-agnostic, layer-based graph encodings and Ramanujan-based metrics, and introduce an unrolled, input-aware Multipartite Graph Encoding (MGE) that captures end-to-end processing by linking successive layer graphs and accounting for input dimensionality. They define a suite of topological metrics (topometrics), ranging from local connectivity to expansion properties, and show that these end-to-end metrics predict the accuracy drop and rank pruning-at-initialization (PaI) methods better than existing metrics across multiple CNN architectures and datasets. A key finding is that Ramanujan-based metrics are often no more informative than simple layer density, while a mixture of topometrics provides the strongest predictive power and can guide practical PaI selection. The codebase is publicly available, and the results suggest that end-to-end, input-aware graph representations can more effectively inform SNN design and pruning strategies.

Abstract

Pruning-at-Initialization (PaI) algorithms provide Sparse Neural Networks (SNNs) which are computationally more efficient than their dense counterparts, and try to avoid performance degradation. While much emphasis has been directed towards \emph{how} to prune, we still do not know \emph{what topological metrics} of the SNNs characterize \emph{good performance}. From prior work, we have layer-wise topological metrics by which SNN performance can be predicted: the Ramanujan-based metrics. To exploit these metrics, proper ways to represent network layers via Graph Encodings (GEs) are needed, with Bipartite Graph Encodings (BGEs) being the \emph{de-facto} standard at the current stage. Nevertheless, existing BGEs neglect the impact of the inputs, and do not characterize the SNN in an end-to-end manner. Additionally, through a thorough study of the Ramanujan-based metrics, we discover that, when paired with BGEs, they are only as good as the \emph{layer-wise density} as performance predictors. To close both gaps, we design a comprehensive topological analysis for SNNs with both linear and convolutional layers, via (i) a new input-aware Multipartite Graph Encoding (MGE) for SNNs and (ii) the design of new end-to-end topological metrics over the MGE. With these novelties, we show the following: (a) the proposed MGE makes it possible to extract topological metrics that are much better predictors of the accuracy drop than metrics computed from current input-agnostic BGEs; (b) which metrics are important at different sparsity levels and for different architectures; (c) a mixture of our topological metrics can rank PaI algorithms more effectively than Ramanujan-based metrics. The codebase is publicly available at https://github.com/eliacunegatti/mge-snn.
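
As a concrete reference point for the layer-wise density baseline mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each layer is described by a binary mask (1 = kept weight, 0 = pruned), and the helper names `layer_density` and `normalize_across_layers`, as well as the toy masks, are hypothetical. The second helper mirrors the scaling described in the Figure 2 caption, where values are divided by their overall sum across layers so that they sum to 1.

```python
def layer_density(mask):
    """Fraction of kept (non-zero) entries in a 2-D binary mask."""
    total = sum(len(row) for row in mask)
    kept = sum(1 for row in mask for value in row if value != 0)
    return kept / total


def normalize_across_layers(values):
    """Scale per-layer values so that they sum to 1 across layers."""
    overall = sum(values.values())
    return {name: value / overall for name, value in values.items()}


# Toy masks for two linear layers (hypothetical values, for illustration only).
masks = {
    "fc1": [[1, 0, 0, 1], [0, 0, 1, 0]],  # 3 of 8 weights kept
    "fc2": [[1, 1], [0, 1]],              # 3 of 4 weights kept
}

densities = {name: layer_density(mask) for name, mask in masks.items()}
scaled = normalize_across_layers(densities)
```

Here `densities` holds the raw per-layer densities and `scaled` holds the same values rescaled for a cross-layer visual comparison.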

Paper Structure (47 sections, 5 equations, 5 figures, 12 tables)

Introduction
Contributions.
Related Work
Pruning at Initialization (PaI).
Graph Representation of SNNs and DNNs.
Methodology
Bipartite Graph Encoding (BGE)
Linear Layers.
Convolutional Layers.
Multipartite Graph Encoding (MGE)
Topological Metrics
Local Connectivity.
Neighbor Connectivity.
Strength Connectivity.
Global Connectivity.
...and 32 more sections

Figures (5)

Figure 1: Illustration of the proposed unrolled input-aware BGE with $\mathcal{I} = 3 \times 3 \times 3$ and convolutional parameters $(C_{\textit{out}} = 2, C_{\textit{in}} = 3,w_{\textit{ker}} = 2, h_{\textit{ker}} = 2, P=0, S=1)$. (I) and (II) show, respectively, the first and second convolutional steps and how the graph edges are generated assuming that all the kernel parameters are unmasked. (III) and (IV) show the complete graph representation after all the convolutional steps have been done, in the dense and sparse cases, respectively.
Figure 2: Comparison between the Ramanujan-based metrics and the Layer-Density. To enable a visual comparison, we scale all values by their overall sum across layers s.t. they sum to 1.
Figure 3: Pearson correlation coefficients between $\downarrow acc$ and each topometric (Sparsity-Fixed scenario).
Figure 4: Pearson correlation coefficients between $\downarrow acc$ and each topometric (Architecture-Fixed scenario).
Figure 5: Pearson correlation coefficients ($r$) between Ramanujan-based metrics and Layer-Density.
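
The unrolled encoding described in the Figure 1 caption can be sketched as follows. This is an illustrative reading of the caption rather than the authors' code: it assumes that left nodes are input positions (one per channel and pixel), that right nodes are output feature-map positions, and that each unmasked kernel parameter contributes one edge per convolutional step. The function name `unrolled_conv_edges` and the node labels are hypothetical.

```python
from itertools import product


def unrolled_conv_edges(h_in, w_in, mask, stride=1, padding=0):
    """Build bipartite edges for one convolutional layer.

    mask[o][c][i][j] is 1 if the kernel parameter for output channel o,
    input channel c and kernel position (i, j) is kept, else 0.
    """
    c_out, c_in = len(mask), len(mask[0])
    h_ker, w_ker = len(mask[0][0]), len(mask[0][0][0])
    h_out = (h_in + 2 * padding - h_ker) // stride + 1
    w_out = (w_in + 2 * padding - w_ker) // stride + 1
    edges = set()
    # One convolutional step per output position (y, x) of each output channel.
    for o, y, x in product(range(c_out), range(h_out), range(w_out)):
        for c, i, j in product(range(c_in), range(h_ker), range(w_ker)):
            if not mask[o][c][i][j]:
                continue  # pruned parameter: no edge
            r = y * stride + i - padding
            s = x * stride + j - padding
            if 0 <= r < h_in and 0 <= s < w_in:  # ignore padded positions
                edges.add((("in", c, r, s), ("out", o, y, x)))
    return edges, (h_out, w_out)


# Setting from the Figure 1 caption: 3 x 3 input with 3 channels,
# C_out = 2, C_in = 3, 2 x 2 kernel, P = 0, S = 1.
dense_mask = [[[[1] * 2 for _ in range(2)] for _ in range(3)] for _ in range(2)]
dense_edges, out_shape = unrolled_conv_edges(3, 3, dense_mask)

# Sparse case: prune a single kernel parameter (arbitrary choice).
sparse_mask = [[[row[:] for row in ker] for ker in chan] for chan in dense_mask]
sparse_mask[0][0][0][0] = 0
sparse_edges, _ = unrolled_conv_edges(3, 3, sparse_mask)
```

Under these assumptions the dense layer yields a 2 x 2 output map per channel and 96 edges, and pruning one kernel parameter removes 4 edges, one for each convolutional step in which that parameter is applied.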
