The Geometry of the Set of Equivalent Linear Neural Networks
Jonathan Richard Shewchuk, Sagnik Bhattacharya
TL;DR
This work provides a geometric and topological framework for the fiber $\mu^{-1}(W)$ of a linear neural network, i.e., the set of all weight factorizations that yield a fixed linear map $W$. By introducing a rank-based stratification and a system of subspaces ($A_{kji}$, $B_{kji}$, prebases, and basis-flow diagrams), the authors characterize the tangent and normal spaces of each stratum, connect strata via rank-1 abstract moves, and prove that each stratum is a $C^\infty$ manifold whose dimension is computable from the rank list. A canonical weight construction and a forward/transpose duality give a geometric view of how information flows through the network and how that flow affects optimization paths. The paper also develops an algorithmic approach to building the stratum DAG, which enables practical exploration of the fiber's geometry and may inform improvements to training dynamics. Overall, the rank stratification offers a principled lens for understanding how different network decompositions realize the same linear map and how gradient-based optimization might traverse or avoid spurious critical regions.
Abstract
We characterize the geometry and topology of the set of all weight vectors for which a linear neural network computes the same linear transformation $W$. This set of weight vectors is called the fiber of $W$ (under the matrix multiplication map), and it is embedded in the Euclidean weight space of all possible weight vectors. The fiber is an algebraic variety that is not necessarily a manifold. We describe a natural way to stratify the fiber, that is, to partition the algebraic variety into a finite set of manifolds of varying dimensions called strata. We call this set of strata the rank stratification. We derive the dimensions of these strata and the relationships by which they adjoin each other. Although the strata are disjoint, their closures are not. Our strata satisfy the frontier condition: if a stratum intersects the closure of another stratum, then the former stratum is a subset of the closure of the latter stratum. Each stratum is a manifold of class $C^\infty$ embedded in weight space, so it has a well-defined tangent space and normal space at every point (weight vector). We show how to determine the subspaces tangent to and normal to a specified stratum at a specified point on the stratum, and we construct elegant bases for those subspaces. To help achieve these goals, we first derive what we call a Fundamental Theorem of Linear Neural Networks, analogous to what Strang calls the Fundamental Theorem of Linear Algebra. We show how to decompose each layer of a linear neural network into a set of subspaces that show how information flows through the neural network. Each stratum of the fiber represents a different pattern by which information flows (or fails to flow) through the neural network. The topology of a stratum depends solely on this decomposition. So does its geometry, up to a linear transformation in weight space.
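As a small illustration of why the fiber need not be a manifold, consider a two-layer network with one unit per layer and $W = 0$. The fiber is then the set of pairs $(w_1, w_2)$ with $w_2 w_1 = 0$, which is the union of the two coordinate axes and fails to be a manifold at the origin. The sketch below is not taken from the paper: the helper names `mu` and `layer_ranks` are hypothetical, and grouping points by the rank of each individual layer is a simplifying assumption that stands in for the paper's full rank list.

```python
import numpy as np


def mu(weights):
    """Multiply the layer matrices; weights are ordered [W_1, ..., W_L]."""
    product = weights[0]
    for W in weights[1:]:
        product = W @ product  # later layers multiply on the left
    return product


def layer_ranks(weights):
    """Rank of each layer matrix (a simplified stand-in for a rank list)."""
    return tuple(int(np.linalg.matrix_rank(W)) for W in weights)


# Three points on the fiber of W = 0 for a 1-1-1 network.
target = np.zeros((1, 1))
points = {
    "origin": [np.array([[0.0]]), np.array([[0.0]])],
    "w1_axis": [np.array([[2.0]]), np.array([[0.0]])],
    "w2_axis": [np.array([[0.0]]), np.array([[3.0]])],
}

for name, weights in points.items():
    # Every point computes the same linear map, so all lie on one fiber.
    assert np.allclose(mu(weights), target)
    print(name, layer_ranks(weights))
```

In this toy case the three rank patterns separate the origin from the two punctured axes. Each piece is a smooth manifold on its own (a point or an open pair of rays), and the origin lies in the closure of both axes, which matches the frontier condition described above.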
