On Space Folds of ReLU Neural Networks
Michal Lewandowski, Hamid Eghbalzadeh, Bernhard Heinzl, Raphael Pisoni, Bernhard A. Moser
TL;DR
The paper introduces a quantitative framework to study space folding in ReLU networks by mapping straight input lines into the activation space and measuring convexity deviations with a new space folding measure. It proves equivalence of convexity notions between input and activation spaces for certain hyperplanes, and defines range-based metrics that quantify folding via the measure $\chi(\Gamma)$, with a global bound $\Phi_{\mathcal{N}}$. Empirical analyses on CantorNet and MNIST demonstrate folding phenomena that intensify with network depth and relate to generalization, revealing a structured, self-similar geometry in activation space. The approach provides a novel lens to understand how neural networks transform and compress input data, with potential extensions to other architectures, normalization schemes, and learning settings. Overall, the work lays groundwork for interpreting activation patterns through geometric folding, offering a tool for characterizing and comparing neural representations.
Abstract
Recent findings suggest that the consecutive layers of ReLU neural networks can be understood geometrically as space folding transformations of the input space, revealing patterns of self-similarity. In this paper, we present the first quantitative analysis of this space folding phenomenon in ReLU neural networks. Our approach examines how straight paths in the Euclidean input space are mapped to their counterparts in the Hamming activation space. In this process, the convexity of straight lines is generally lost, giving rise to non-convex folding behavior. To quantify this effect, we introduce a novel measure based on range metrics, similar to those used in the study of random walks, and prove the equivalence of convexity notions between the input and activation spaces. Furthermore, we provide an empirical analysis on a geometrical analysis benchmark (CantorNet) as well as an image classification benchmark (MNIST). Our work advances the understanding of the activation space in ReLU neural networks by leveraging the phenomenon of geometric folding, providing valuable insights into how these models process input information.
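To make the idea of mapping a straight input path into the Hamming activation space concrete, here is a minimal Python sketch. It is not the paper's measure $\chi(\Gamma)$: the toy two-layer network, the helper names (`activation_pattern`, `hamming_profile`, `max_drop`), and the "largest drop below the running maximum" proxy are all assumptions made for illustration. The sketch only shows that the Hamming distance from the starting activation pattern can decrease along a straight line once a second layer is present.

```python
import numpy as np


def activation_pattern(x, weights, biases):
    """Binary activation pattern (1 = active) of all hidden ReLU neurons for input x."""
    bits = []
    h = x
    for W, b in zip(weights, biases):
        pre = W @ h + b
        bits.append(pre > 0)
        h = np.maximum(pre, 0.0)
    return np.concatenate(bits).astype(np.uint8)


def hamming_profile(x_start, x_end, weights, biases, n_steps=200):
    """Hamming distance to the start pattern along the straight segment x_start -> x_end."""
    ts = np.linspace(0.0, 1.0, n_steps)
    patterns = np.stack([
        activation_pattern((1.0 - t) * x_start + t * x_end, weights, biases)
        for t in ts
    ])
    return (patterns != patterns[0]).sum(axis=1)


def max_drop(profile):
    """Largest decrease of the profile below its running maximum (0 means monotone)."""
    running_max = np.maximum.accumulate(profile)
    return int((running_max - profile).max())


# Hypothetical toy network on a 1-D input (an assumption, not from the paper).
# Layer 1: relu(x), relu(x - 1). Layer 2: relu(h1 - 2*h2 - 0.5),
# which is active only for x in (0.5, 1.5), so it switches on and off again.
weights = [np.array([[1.0], [1.0]]), np.array([[1.0, -2.0]])]
biases = [np.array([0.0, -1.0]), np.array([-0.5])]

x_start, x_end = np.array([-1.0]), np.array([3.0])

profile = hamming_profile(x_start, x_end, weights, biases)
shallow = hamming_profile(x_start, x_end, weights[:1], biases[:1])

print(profile.max(), profile[-1], max_drop(profile))  # 3 2 1
print(max_drop(shallow))                              # 0
```

With only the first layer, each neuron's hyperplane is crossed at most once along the line, so the distance profile is monotone and the drop is 0. Adding the second layer lets a neuron return to its initial state, so the path moves back toward the starting pattern in Hamming space; this is the kind of non-convex behavior the abstract refers to as folding.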
