On the Hardness of Learning One Hidden Layer Neural Networks
Shuchen Li, Ilias Zadik, Manolis Zampetakis
TL;DR
This work proves that learning polynomial-width one-hidden-layer ReLU networks under Gaussian inputs and polynomially small Gaussian noise is computationally hard under standard cryptographic assumptions. The authors build a chain of reductions: hardness of CLWE implies hardness of learning Lipschitz periodic neurons under Gaussian noise, which is then shown to be equivalent to learning certain one-hidden-layer networks on a capped interval. Combined with quantum reductions from GapSVP to CLWE, this transfers the worst-case hardness of GapSVP to the learning problem. The main result shows that any polynomial-time $\varepsilon$-weak learner with $\varepsilon=1/\mathrm{poly}(d)$ for networks of width $k=\omega(\sqrt{d\log d})$ would imply a polynomial-time quantum algorithm for GapSVP within polynomial factors, ruling out efficient learning in this regime unless these lattice problems are tractable. The paper also extends the hardness to super-polynomially small noise levels through the LWE/CLWE reduction framework, which bears on the computational landscape of neural-network learning with Gaussian inputs.
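The step from periodic neurons to ReLU networks on a capped interval can be illustrated with a small sketch. It is an illustration only and not necessarily the paper's construction: it assumes a triangle wave as the Lipschitz periodic function, and the names `triangle_wave`, `relu_units`, and `relu_network` and the radius `R` are hypothetical. The idea is that a piecewise-linear periodic function restricted to $[-R, R]$ has finitely many kinks, so one ReLU unit per kink reproduces it exactly on that interval.

```python
def relu(t):
    return t if t > 0.0 else 0.0


def triangle_wave(z):
    """1-Lipschitz, period-1 function: distance from z to the nearest integer."""
    return abs(z - round(z))


def relu_units(R):
    """Hidden units (coefficient, breakpoint) matching triangle_wave on [-R, R].

    R is assumed to be a positive integer. The wave has kinks at multiples
    of 1/2, so 4 * R units are enough to cover the interval.
    """
    units = [(1.0, -float(R))]  # slope +1 starting at the left endpoint
    for j in range(1, 4 * R):
        # Slope flips +1 -> -1 at half-integers (odd j), -1 -> +1 at integers.
        coeff = -2.0 if j % 2 == 1 else 2.0
        units.append((coeff, -R + j / 2.0))
    return units


def relu_network(z, units):
    """One-hidden-layer ReLU network in one variable: sum_j c_j * relu(z - t_j)."""
    return sum(c * relu(z - t) for c, t in units)


if __name__ == "__main__":
    units = relu_units(3)
    for z in (-2.75, -0.4, 0.0, 1.25, 2.9):
        print(z, triangle_wave(z), relu_network(z, units))
```

In this sketch the number of hidden units grows linearly with the number of periods covered, and the two functions agree only inside $[-R, R]$; outside the interval the ReLU network continues linearly and stops being periodic.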
Abstract
In this work, we consider the problem of learning one-hidden-layer ReLU neural networks with inputs from $\mathbb{R}^d$. We show that this learning problem is hard under standard cryptographic assumptions even when: (1) the size of the neural network is polynomial in $d$, (2) its input distribution is a standard Gaussian, and (3) the noise is Gaussian and polynomially small in $d$. Our hardness result is based on the hardness of the Continuous Learning with Errors (CLWE) problem, and in particular, is based on the widely believed worst-case hardness of approximately solving the shortest vector problem up to a multiplicative polynomial factor.
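To make the learning setup concrete, the following minimal sketch draws labeled samples under conditions (2) and (3): standard Gaussian inputs and additive Gaussian label noise. The function names, the parameterization with biases and outer weights, and the specific values of `d`, `k`, and `sigma` are assumptions chosen for illustration, not taken from the paper.

```python
import random


def relu(t):
    return t if t > 0.0 else 0.0


def network(x, weights, biases, outer):
    """One-hidden-layer ReLU network: sum_j outer[j] * relu(<weights[j], x> + biases[j])."""
    total = 0.0
    for w, b, a in zip(weights, biases, outer):
        pre = sum(wi * xi for wi, xi in zip(w, x)) + b
        total += a * relu(pre)
    return total


def sample(n, d, weights, biases, outer, sigma, rng):
    """Draw n pairs (x, y) with x ~ N(0, I_d) and y = network(x) + N(0, sigma^2)."""
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        y = network(x, weights, biases, outer) + rng.gauss(0.0, sigma)
        data.append((x, y))
    return data


if __name__ == "__main__":
    rng = random.Random(0)
    d, k = 8, 4            # hypothetical input dimension and width
    sigma = 1.0 / d ** 2   # one example of a polynomially small noise level
    weights = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
    biases = [0.0] * k
    outer = [1.0] * k
    for x, y in sample(3, d, weights, biases, outer, sigma, rng):
        print(round(y, 4))
```

A learner in this setting sees only the pairs `(x, y)` and must output a predictor; the hardness result says that, for suitable network families of polynomial width, no polynomial-time algorithm can do so even weakly unless the underlying lattice problems are easy.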
