Hardness of Learning Fixed Parities with Neural Networks
Itamar Shoshani, Ohad Shamir
TL;DR
The paper addresses why fixed parity functions remain hard to learn with gradient-based methods, even though parities are learnable in theory from few samples. By proving a novel exponential decay bound on the Fourier coefficients of linear threshold functions and connecting it to the gradients encountered by perturbed gradient descent, it shows that standard training of one-hidden-layer ReLU networks fails to meaningfully reduce the parity-learning objective for any fixed parity whose support S is sufficiently large, including the full parity. A complementary single-neuron result under squared loss exhibits the same hardness, indicating that the obstacle lies in the training dynamics rather than in expressivity. Together, these results illuminate the practical limits of gradient-based learning for parity tasks and leave open questions about extending the analysis to SGD, other architectures, and non-spherical weight distributions.
Abstract
Learning parity functions is a canonical problem in learning theory, which, although computationally tractable, is not amenable to standard learning algorithms such as gradient-based methods. This hardness is usually explained via statistical query lower bounds [Kearns, 1998]. However, these bounds only imply that for any given algorithm, there is some worst-case parity function that will be hard to learn. Thus, they do not explain why fixed parities - say, the full parity function over all coordinates - are difficult to learn in practice, at least with standard predictors and gradient-based methods [Abbe and Boix-Adsera, 2022]. In this paper, we address this open problem by showing that for any fixed parity of some minimal size, using it as a target function to train one-hidden-layer ReLU networks with perturbed gradient descent will fail to produce anything meaningful. To establish this, we prove a new result about the decay of the Fourier coefficients of linear threshold (or weighted majority) functions, which may be of independent interest.
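To make the objects in the abstract concrete, the following is a minimal Python sketch, not taken from the paper, that computes the Fourier coefficient of a linear threshold function on a parity by brute-force enumeration. It assumes the standard convention that inputs are uniform on {-1, +1}^n, that the parity on a set S is the product of the coordinates in S, and that the Fourier coefficient is the expectation of f(x) times that parity. The function names `parity`, `linear_threshold`, and `fourier_coefficient`, and the tie-breaking rule at zero, are illustrative choices.

```python
import itertools
import math


def parity(x, S):
    """chi_S(x): product of the coordinates of x indexed by S, x in {-1, +1}^n."""
    return math.prod(x[i] for i in S)


def linear_threshold(x, w, b=0.0):
    """sign(<w, x> + b); ties at zero go to +1 (an arbitrary convention)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


def fourier_coefficient(f, S, n):
    """Exact E_x[f(x) * chi_S(x)] for x uniform on {-1, +1}^n.

    Enumerates all 2^n inputs, so it is only practical for small n.
    """
    total = sum(f(x) * parity(x, S) for x in itertools.product((-1, 1), repeat=n))
    return total / 2 ** n


if __name__ == "__main__":
    n = 3
    w = (1.0, 1.0, 1.0)  # unweighted majority on three bits
    full = tuple(range(n))  # support of the full parity
    maj = lambda x: linear_threshold(x, w)
    print(fourier_coefficient(maj, full, n))
```

For the three-bit majority, the coefficient on the full parity is -1/2, which follows from the expansion Maj(x) = (x1 + x2 + x3)/2 - x1 x2 x3 / 2. Because the enumeration costs 2^n evaluations, this sketch only illustrates the definition at small n and says nothing by itself about the decay rate the paper proves.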
