Linear Independence of Generalized Neurons and Related Functions

Leyang Zhang

Abstract

The linear independence of neurons plays a significant role in the theoretical analysis of neural networks. Specifically, given neurons $H_1, ..., H_n: \mathbb{R}^N \times \mathbb{R}^d \to \mathbb{R}$, we are interested in the following question: when are $\{H_1(\theta_1, \cdot), ..., H_n(\theta_n, \cdot)\}$ linearly independent as the parameters $\theta_1, ..., \theta_n$ of these functions vary over $\mathbb{R}^N$? Previous works give a complete characterization for two-layer neurons without bias, for generic smooth activation functions. In this paper, we study the problem for neurons with arbitrary layers and widths, giving a simple but complete characterization for generic analytic activation functions.

Paper Structure (13 sections, 31 theorems, 114 equations, 4 figures)


Introduction
Notations and Assumptions
Preparing Lemmas and Propositions
Functions of Ordered Growth
Local Asymptotics of Parameterized Functions
Analytic Bump Functions
Theory of General Neurons
ZSigma for Activations Vanishing at Origin
Misc
Generic Activations are not Enough
Two-layer Neurons With Bias
Three-layer Neurons with Sigmoid and Tanh Activation
Discussion and Conclusion

Key Result

Lemma 3.1

Fix $m \in \mathbb{N}$ and let $w_1, ..., w_m \in \mathbb{R}^d$ be distinct. Then there is a $v \in \partial B(0,1) \subseteq \mathbb{R}^d$ such that $\langle w_1, v\rangle, ..., \langle w_m, v\rangle$ are distinct. Moreover, if $w_k, w_j$ are multiples of one another, then for any $v \in \partial B(0,1
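
The first claim of the lemma can be checked numerically. The sketch below is not the paper's proof; it relies on the standard observation that the directions $v$ failing the conclusion lie in the finite union of hyperplanes $\{v : \langle w_i - w_j, v\rangle = 0\}$, so a randomly sampled unit vector should separate the projections with probability one. The function names, the tolerance `tol`, and the sample vectors are all hypothetical choices made for illustration.

```python
import itertools
import math
import random


def random_unit_vector(d, rng):
    """Sample a point on the unit sphere in R^d by normalizing a Gaussian vector."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in g))
        if norm > 0.0:
            return [x / norm for x in g]


def inner(u, v):
    """Euclidean inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def separating_direction(ws, rng, tol=1e-9, max_tries=100):
    """Search for a unit vector v whose inner products with the vectors in ws
    are pairwise distinct (differences larger than tol)."""
    d = len(ws[0])
    for _ in range(max_tries):
        v = random_unit_vector(d, rng)
        proj = [inner(w, v) for w in ws]
        if all(abs(a - b) > tol for a, b in itertools.combinations(proj, 2)):
            return v, proj
    raise RuntimeError("no separating direction found; are the inputs distinct?")


# Hypothetical example: four distinct vectors in R^3.
rng = random.Random(0)
ws = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (1.0, 0.0, 2.0)]
v, proj = separating_direction(ws, rng)
print("v =", v)
print("projections =", proj)
```

Random sampling is used here only because it is the shortest way to land outside a measure-zero set; a deterministic construction would serve equally well for the illustration.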

Figures (4)

Figure 1: Overview and structure of this paper.
Figure 2: Illustration of example (a): how to construct the function sequence $\{f_n\}_{n=1}^\infty$ for $\rho(x) = e^x$.
Figure 3: Construction of an analytic function $\Tilde{\sigma}$ that approximates Tanh activation on an interval around 0, following Proposition \ref{Prop function conca and S-order approx} (a). Here we use $\sigma(x) = e^{x^2}$ and $\zeta_4$ defined as in Corollary \ref{Cor Analytic bump function}, with "base function" $f(x) = e^{|x|}$.
Figure 4: Construction of an analytic function $\Tilde{\sigma}$ that approximates Tanh activation globally on $\mathbb{R}$, following Proposition \ref{Prop function conca and S-order approx} (b). In the construction, we use $\sigma$ constructed as in Figure \ref{Figure Good Analytic Function for Tanh}. $\zeta$ is a scaling of an analytic bump function $\zeta_5$ defined in Corollary \ref{Cor Analytic bump function} with "base function" $f(x) = e^{|x|}$. Precisely, each function takes the form $\sigma(x) = \zeta_5(\alpha x) [\zeta_4(x) \tanh(x) + (1-\zeta_4(x))\tanh(x)] + (1 - \zeta_5(\alpha x)) e^{x^2}$, where $\alpha = 1.1, 1.3, 1.5, 2$ for the pink, yellow, orange, and green curves, respectively. As we can see, the approximation is almost indistinguishable from Tanh when $\alpha = 2$.

Theorems & Definitions (81)

Definition 2.1: Generalized neurons and generalized NN
Remark 2.1
Definition 2.2: fully-connected neural network
Remark 2.2
Definition 2.3: Function asymptotics
Definition 2.4: Hyper-polynomial growth
Definition 2.5: Hyper-exponential growth
Definition 2.6: Ordered growth
Lemma 3.1: dimension reduction
Proposition 3.1: functions of ordered growth are linearly independent
...and 71 more
