Recursive Self-Similarity in Deep Weight Spaces of Neural Architectures: A Fractal and Coarse Geometry Perspective

Ambarish Moharil, Indika Kumara, Damian Andrew Tamburri, Majid Mohammadi, Willem-Jan van den Heuvel

TL;DR

The paper addresses how neural weight spaces exhibit multi-scale, self-similar structure by introducing a fractal transformation $T_{r_k}$, a coarse group action of $(\mathbb{Z},+)$ on discrete weight grids. It develops a rigorous framework combining fractal geometry and coarse geometry to analyze weight matrices across scales, proving properties such as composition, linearity, coarse properness, and asymptotic invertibility, and establishing a Discrete Scale Invariance (DSI) criterion with fractal dimension $D$ that can exceed the topological dimension $D_T$. The approach also integrates activation intertwiner groups to ensure scale-consistent behavior of activations and demonstrates permutation-invariance at the global segmentation level with weak local equivariance. Experiments on CIFAR-10 with CNNs reveal architecture-dependent fractal dynamics, supporting the notion that deep networks encode hierarchical, self-similar representations that can be probed via fractal segmentation. Overall, the work provides a principled, scalable lens for multi-scale neural architecture analysis, with potential implications for design, interpretation, and optimization of deep models.

Abstract

This paper conceptualizes the Deep Weight Spaces (DWS) of neural architectures as hierarchical, fractal-like, coarse geometric structures observable at discrete integer scales through recursive dilation. We introduce a coarse group action termed the fractal transformation, $T_{r_k}$, acting under the symmetry group $G = (\mathbb{Z}, +)$, to analyze neural parameter matrices or tensors by segmenting the underlying discrete grid $\Omega$ into $N(r_k)$ fractals across varying observation scales $r_k$. This perspective adopts the box-counting technique, which is commonly used to assess the hierarchical and scale-related geometry of physical structures and has been extensively formalized in fractal geometry. We assess the structural complexity of neural layers by estimating their Hausdorff-Besicovitch dimension and evaluating their degree of self-similarity. The fractal transformation has key algebraic properties such as linearity, identity, and asymptotic invertibility, the last of which is a signature of coarse structures. We show that the coarse group action exhibits a set of symmetries: Discrete Scale Invariance (DSI) under recursive dilation, strong invariance followed by weak equivariance to permutations, and compatibility with the scaling equivariance of activation functions, as defined by the intertwiner group relations. Our framework targets large-scale structural properties of DWS, deliberately overlooking minor inconsistencies to focus on significant geometric characteristics of neural networks. Experiments on CIFAR-10 using ResNet-18, VGG-16, and a custom CNN validate our approach, demonstrating effective fractal segmentation and structural analysis.
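The box-counting idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the `threshold` rule for deciding when a box is occupied, and the toy grids are all assumptions made for illustration. The sketch segments a 2D grid into boxes of side $r$, counts the occupied boxes $N(r)$, and takes the least-squares slope of $\log N(r)$ against $\log(1/r)$ as a dimension estimate.

```python
import math


def count_boxes(grid, r, threshold=0.0):
    """Count r-by-r boxes holding at least one entry with |w| > threshold.

    The occupancy rule (magnitude above a threshold) is an assumption of this
    sketch, not something the paper specifies here.
    """
    rows, cols = len(grid), len(grid[0])
    occupied = 0
    for i in range(0, rows, r):
        for j in range(0, cols, r):
            if any(
                abs(grid[a][b]) > threshold
                for a in range(i, min(i + r, rows))
                for b in range(j, min(j + r, cols))
            ):
                occupied += 1
    return occupied


def box_counting_dimension(grid, scales, threshold=0.0):
    """Least-squares slope of log N(r) versus log(1/r) over the given scales."""
    counts = [count_boxes(grid, r, threshold) for r in scales]
    if min(counts) == 0:
        raise ValueError("no occupied boxes at some scale; lower the threshold")
    xs = [math.log(1.0 / r) for r in scales]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy grids standing in for weight matrices (hypothetical data).
dense = [[1.0] * 16 for _ in range(16)]  # every box is occupied
diag = [[1.0 if i == j else 0.0 for j in range(16)] for i in range(16)]

d_dense = box_counting_dimension(dense, [1, 2, 4, 8])  # fills the plane
d_diag = box_counting_dimension(diag, [1, 2, 4, 8])    # a line in the plane
```

On these toy inputs the dense grid yields a slope of 2 and the diagonal grid a slope of 1, matching the intuition that the estimate measures how the occupied set fills the grid across scales. Real weight matrices would need a deliberate choice of threshold, which this sketch leaves open.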

Paper Structure

This paper contains 23 sections, 108 equations, 2 figures, 4 algorithms.

Figures (2)

  • Figure 1: Panel (a), top left, shows the log-log plot used to estimate the fractal dimension $D$ of the fully connected layers in ResNet-18, VGG-16, and SimpleCNN, analyzed for $\lambda \in \{2, 3, 5, 7\}$. Panel (b), top right, shows the $D$ estimation for ResNet-18’s convolutional layers for $\lambda \in \{2, 3, 5, 7, 9\}$. Consistent $D$ values across scales indicate self-similarity and structural invariance, with convergence towards $D = 2$ reflecting hierarchical abstraction in deeper layers.
  • Figure 2: Plots (a) and (b) show the $D$ estimations for SimpleCNN and VGG-16, respectively. Consistent $D$ values across scales indicate self-similarity and structural invariance, with convergence towards $D = 2$ reflecting hierarchical abstraction in deeper layers.
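The captions read consistent $D$ values across scales as a sign of self-similarity. The sketch below shows one way such a consistency check could look. It assumes the observation scales follow $r_k = \lambda^k$, and it uses a Sierpinski carpet as a hypothetical stand-in for a layer's weight grid, since its dimension $\log 8 / \log 3$ is known. The helper names are illustrative and do not come from the paper.

```python
import math


def sierpinski_carpet(level):
    """Binary 3**level square grid; a cell is empty when some base-3 digit
    pair of its (row, col) index is (1, 1)."""
    n = 3 ** level

    def filled(i, j):
        while i > 0 or j > 0:
            if i % 3 == 1 and j % 3 == 1:
                return 0
            i //= 3
            j //= 3
        return 1

    return [[filled(i, j) for j in range(n)] for i in range(n)]


def occupied_boxes(grid, r):
    """Number of r-by-r boxes containing a nonzero cell (square grids only)."""
    n = len(grid)
    return sum(
        1
        for i in range(0, n, r)
        for j in range(0, n, r)
        if any(
            grid[a][b]
            for a in range(i, min(i + r, n))
            for b in range(j, min(j + r, n))
        )
    )


def local_dimensions(grid, lam, k_max):
    """Slope of log N(r) between consecutive scales r_k = lam**k (assumed form)."""
    counts = [occupied_boxes(grid, lam ** k) for k in range(k_max + 1)]
    return [
        math.log(counts[k] / counts[k + 1]) / math.log(lam)
        for k in range(k_max)
    ]


carpet = sierpinski_carpet(4)                    # 81 x 81 grid
dims = local_dimensions(carpet, lam=3, k_max=3)  # scales 1, 3, 9, 27
spread = max(dims) - min(dims)                   # near zero when D is scale-consistent
```

For the carpet, every local slope equals $\log 8 / \log 3 \approx 1.893$, so the spread vanishes. On trained weights the slopes would generally differ from scale to scale, and the size of that spread gives a rough indication of how far a layer departs from exact discrete scale invariance.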