Injectivity of ReLU-layers: Tools from Frame Theory

Daniel Haider, Martin Ehler, Peter Balazs

TL;DR

This work studies the injectivity of a single ReLU layer $C_\alpha(x)=\text{ReLU}(Cx-\alpha)$ by casting the problem in frame theory. It introduces the notion of $\alpha$-rectifying frames and analyzes injectivity on bounded domains $K$, linking input geometry and bias to reconstructability. The paper provides two practical methods for approximating a maximal bias: Approach A, based on most correlated bases, and Approach B, based on inscribing polytopes (PBE). It also develops duality-based reconstruction formulas via the frame algorithm, with stability considerations. Together, these results yield a concrete framework for studying information loss in ReLU layers and for reconstructing inputs, with applications to inverse problems and interpretable neural network design. The methods combine geometric, combinatorial, and algorithmic tools to make the injectivity problem tractable on bounded domains.
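As a small illustration of the layer $C_\alpha(x)=\text{ReLU}(Cx-\alpha)$ and of why bias and domain matter for injectivity, the following NumPy sketch may help. It is not taken from the paper: the helper name `relu_layer`, the choice of the standard basis as weights, and the sample points are assumptions made only for this example.

```python
import numpy as np

def relu_layer(C, alpha, x):
    """Evaluate C_alpha(x) = ReLU(Cx - alpha) for weights C and bias alpha."""
    return np.maximum(C @ x - alpha, 0.0)

# Hypothetical toy setup: the standard basis of R^2 as weight rows, zero bias.
C = np.eye(2)
alpha_zero = np.zeros(2)

# Two distinct inputs that differ only in a coordinate the ReLU clips to zero.
x1 = np.array([-1.0, 2.0])
x2 = np.array([-3.0, 2.0])
y1 = relu_layer(C, alpha_zero, x1)
y2 = relu_layer(C, alpha_zero, x2)
collision = np.array_equal(y1, y2)  # same output, so not injective on R^2

# With bias -1, every unit stays active for inputs in the closed unit ball,
# so on that domain the layer acts affinely and can be inverted directly.
alpha_ball = -np.ones(2)
x = np.array([0.6, -0.8])                    # a point with norm 1
y = relu_layer(C, alpha_ball, x)
x_rec = np.linalg.solve(C, y + alpha_ball)   # undo the affine map

print("outputs collide with zero bias:", collision)
print("reconstruction matches input:", np.allclose(x_rec, x))
```

The direct inversion in the second half only works because no unit is clipped on the chosen domain; the paper's reconstruction formulas address the general case, which this sketch does not attempt.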

Abstract

Injectivity is the defining property of a mapping that ensures no information is lost and any input can be perfectly reconstructed from its output. By performing hard thresholding, the ReLU function naturally interferes with this property, making the injectivity analysis of ReLU layers in neural networks a challenging yet intriguing task that has not yet been fully solved. This article establishes a frame theoretic perspective to approach this problem. The main objective is to develop a comprehensive characterization of the injectivity behavior of ReLU layers in terms of all three involved ingredients: (i) the weights, (ii) the bias, and (iii) the domain where the data is drawn from. Maintaining a focus on practical applications, we limit our attention to bounded domains and present two methods for numerically approximating a maximal bias for given weights and data domains. These methods provide sufficient conditions for the injectivity of a ReLU layer on those domains and yield a novel practical methodology for studying the information loss in ReLU layers. Finally, we derive explicit reconstruction formulas based on the duality concept from frame theory.

Paper Structure

This paper contains 17 sections, 28 theorems, 84 equations, 9 figures, 5 algorithms.

Key Result

Theorem 1

Let $\Phi=(\phi_i)_{i=1}^m$ be a frame for $\mathbb{R}^n$, $\alpha \in \mathbb{R}^m$, and $\emptyset\neq K\subseteq \mathbb{R}^n$. Under the assumptions that $\Phi$ includes a unique most correlated basis everywhere (Definition \ref{def:mostcorr}), $K$ is open or strictly convex, and bias-exact for $\Phi$ (Def. [...]) [...] For any bias $\alpha$ the maximal domain $\mathcal{K}_{\alpha}^*$ can be constructed explicitly as [...]

Figures (9)

  • Figure 1: Illustrations of the notions $I_x^{\alpha}$ and $\Omega_i^\alpha$ related to active frame elements on $\mathbb{B}$. The frame $(\phi_1,\phi_2)$ in the rightmost example is $\alpha$-rectifying on $K$ if $K\subseteq (\Omega_1^\alpha\cap \Omega_2^\alpha)$.
  • Figure 2: Left: The frame composed of the standard basis and its negative elements is $\mathbf{0}$-rectifying on $\mathbb{R}^2$. Middle: The standard basis is $\mathbf{0}$-rectifying on $\mathbb{R}^2_+$ and $(-\mathbf{1})$-rectifying on $\mathbb{B}$. Right: The triangle frame is $(-\mathbf{\frac{1}{2}})$-rectifying on $\mathbb{B}$, but never on $\mathbb{R}^2$, since there will always be cones where only one element is active (lighter areas).
  • Figure 3: The dark areas in the left and middle pictures indicate the maximal domains $\mathcal{K}_{\alpha}^*$ for the triangle frame with zero bias (left) and for a normalized random frame with random bias (middle). The right illustration corresponds to Example \ref{ex:a}. It shows the set $K_2 = \{x\in K: 2\in J^*(x)\}$, where $J^*(x)$ is the most correlated basis for $x$; see Definition \ref{def:mostcorr}.
  • Figure 4: The three subplots show different decompositions of the sphere in $\mathbb{R}^3$. The black dots indicate the $m=12$ frame elements of a random frame on $\mathbb{S}$. From left to right: The facets $F_j$ of the inscribing polytope $P_\Phi$, the associated spherical caps $F_j^\mathbb{S}=\operatorname{cone}(F_j)\cap \mathbb{S}$, and the spherical patches associated to different most correlated bases obtained by $J^*(x)$. While the facets provide a very intuitive and simple decomposition into sub-frames, the decomposition via the most correlated bases minimizes the correlation directly.
  • Figure 5: For the icosahedron frame we have that $x\in F_j \Leftrightarrow I_{F_j} = J^*(x)$ (left). For less regular frames, this no longer holds (middle). The right picture illustrates the PBE on $\mathbb{B}$ for the icosahedron frame. To obtain $\left(\alpha_{\mathbb{B}}^\Delta\right)_2$, the infima are taken over the points in the conical parts (dark gray area) for all facets adjacent to $\phi_2$ (blue).
  • ...and 4 more figures
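The statements in the caption of Figure 2 can be probed numerically. The sketch below is an illustration, not the paper's method. It assumes that a frame is $\alpha$-rectifying on $K$ when, for every $x\in K$, the active elements $\{\phi_i : \langle x,\phi_i\rangle \geq \alpha_i\}$ still span $\mathbb{R}^n$. The helper names, the sample sizes, and the particular orientation of the triangle frame are hypothetical. Random sampling can only give evidence for the property or produce a counterexample; it proves nothing.

```python
import numpy as np

def active_set(Phi, alpha, x, tol=1e-12):
    """Indices i with <x, phi_i> >= alpha_i; the rows of Phi are the frame elements."""
    return np.flatnonzero(Phi @ x >= alpha - tol)

def rectifying_on_samples(Phi, alpha, samples):
    """True if the active elements span R^n at every sampled point (evidence only)."""
    n = Phi.shape[1]
    for x in samples:
        idx = active_set(Phi, alpha, x)
        if idx.size < n or np.linalg.matrix_rank(Phi[idx]) < n:
            return False
    return True

rng = np.random.default_rng(0)

# Samples spread over the plane, and samples inside the closed unit ball.
plane = 5.0 * rng.normal(size=(2000, 2))
directions = rng.normal(size=(2000, 2))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
ball = directions * np.sqrt(rng.uniform(size=(2000, 1)))

# Frames from the caption; the triangle frame's orientation is an assumption.
std = np.eye(2)
std_pm = np.vstack([std, -std])
angles = np.deg2rad([90.0, 210.0, 330.0])
triangle = np.column_stack([np.cos(angles), np.sin(angles)])

# Add one point far out along the first triangle element as an explicit probe.
plane_probe = np.vstack([plane, 3.0 * triangle[0]])

results = {
    "std_pm, bias 0, plane": rectifying_on_samples(std_pm, np.zeros(4), plane),
    "std, bias 0, plane": rectifying_on_samples(std, np.zeros(2), plane),
    "std, bias 0, orthant": rectifying_on_samples(std, np.zeros(2), np.abs(plane)),
    "std, bias -1, ball": rectifying_on_samples(std, -np.ones(2), ball),
    "triangle, bias -1/2, ball": rectifying_on_samples(triangle, -0.5 * np.ones(3), ball),
    "triangle, bias -1/2, plane": rectifying_on_samples(triangle, -0.5 * np.ones(3), plane_probe),
}
for name, ok in results.items():
    print(f"{name}: {ok}")
```

The rank test is done with `np.linalg.matrix_rank` on the active rows, which is adequate for these small, well-conditioned frames. For nearly degenerate frames a singular-value threshold chosen with care would be the safer option.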

Theorems & Definitions (61)

  • Theorem
  • Definition 2.1: ReLU layer
  • Definition 2.2: $\alpha$-rectifying frames
  • Proposition 2.3
  • Theorem 2.4: Injectivity of ReLU layers I
  • Theorem 2.5: Injectivity of ReLU layers II
  • Proof of Theorem \ref{reluinj1}
  • Proof of Theorem \ref{reluinj0}
  • Lemma 2.6
  • Corollary 2.7
  • ...and 51 more