Training Neural Networks is NP-Hard in Fixed Dimension
Vincent Froese, Christoph Hertrich
TL;DR
This work analyzes the parameterized complexity of training two-layer neural networks with ReLU and linear-threshold activations, focusing on input dimension $d$, hidden width $k$, and target error $\gamma$. It proves NP-hardness in fixed dimension ($d=2$) for both activation types and W[1]-hardness for four ReLUs (or two linear-threshold neurons) with zero training error. It also presents an FPT algorithm for the convex-ReLU case under $\gamma=0$ with running time $2^{O(k^2 d)}\mathrm{poly}(k,L)$. The results delineate boundaries between intractable and tractable regimes across $(d,k)$ and activation type, using geometric constructs such as levees and selection gadgets to encode combinatorial constraints. Together, they settle much of the complexity landscape for exact training of these two-layer networks and motivate future work on approximate training and broader architectures.
Abstract
We study the parameterized complexity of training two-layer neural networks with respect to the dimension of the input data and the number of hidden neurons, considering ReLU and linear threshold activation functions. Although the computational complexity of these problems has been studied numerous times in recent years, several questions are still open. We answer questions by Arora et al. [ICLR '18] and Khalife and Basu [IPCO '22] by showing that both problems are NP-hard for two dimensions, which excludes any polynomial-time algorithm for constant dimension. We also answer a question by Froese et al. [JAIR '22] by proving W[1]-hardness for four ReLUs (or two linear threshold neurons) with zero training error. Finally, in the ReLU case, we show fixed-parameter tractability for the combined parameter of the number of dimensions and the number of ReLUs if the network is assumed to compute a convex map. Our results settle the complexity status regarding these parameters almost completely.
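To make the training problem concrete, the following minimal sketch evaluates a two-layer ReLU network and its training error on a toy instance with $d=2$ and $k=2$. The function names, the restriction of output weights to $\pm 1$, and the $\ell^p$ loss are illustrative assumptions, not definitions taken from the abstract. The hardness results concern the search for parameters that reach a target error, which this sketch does not attempt; it only checks a given parameter choice.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def two_layer_relu(x, weights, biases, signs):
    """Evaluate sum_j signs[j] * relu(<weights[j], x> + biases[j]).

    Assumption for illustration: output weights are signs in {-1, +1}.
    """
    return sum(
        a * relu(sum(wi * xi for wi, xi in zip(w, x)) + b)
        for w, b, a in zip(weights, biases, signs)
    )


def training_error(data, weights, biases, signs, p=2):
    """Sum of |f(x_i) - y_i|^p over all data points (assumed l^p loss)."""
    return sum(
        abs(two_layer_relu(x, weights, biases, signs) - y) ** p
        for x, y in data
    )


# Toy data in dimension d = 2 with labels y = |x_1|.
data = [
    ((-2.0, 0.0), 2.0),
    ((-1.0, 5.0), 1.0),
    ((0.0, 1.0), 0.0),
    ((3.0, -1.0), 3.0),
]

# k = 2 hidden ReLUs: |x_1| = relu(x_1) + relu(-x_1).
weights = [(1.0, 0.0), (-1.0, 0.0)]
biases = [0.0, 0.0]
signs = [1, 1]  # all-positive output weights give a convex map

err_two = training_error(data, weights, biases, signs)

# With only the first neuron, points with negative x_1 are no longer fit.
err_one = training_error(data, [(1.0, 0.0)], [0.0], [1])

print(err_two, err_one)
```

Here the two-neuron network fits all four points exactly, which corresponds to the zero-training-error setting ($\gamma=0$), while dropping one neuron leaves a positive error.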
