On the hardness of learning under symmetries
Bobak T. Kiani, Thien Le, Hannah Lawrence, Stefanie Jegelka, Melanie Weber
TL;DR
The paper tackles the computational hardness of learning equivariant neural networks under gradient-based optimization. By extending the correlational statistical query (CSQ) framework to invariant architectures (notably GNNs and frame-averaged CNNs) and analyzing Gaussian input distributions, it derives exponential and superpolynomial lower bounds that persist despite symmetry. It also proves NP-hardness for proper learning of GNNs and provides experiments that corroborate the hardness results. The findings suggest that symmetry alone is insufficient for efficient learnability in worst-case settings, underscoring the need for additional inductive biases or problem structure to achieve practical guarantees.
Abstract
We study the problem of learning equivariant neural networks via gradient descent. The incorporation of known symmetries ("equivariance") into neural nets has empirically improved the performance of learning pipelines, in domains ranging from biology to computer vision. However, a rich yet separate line of learning theoretic research has demonstrated that actually learning shallow, fully-connected (i.e. non-symmetric) networks has exponential complexity in the correlational statistical query (CSQ) model, a framework encompassing gradient descent. In this work, we ask: are known problem symmetries sufficient to alleviate the fundamental hardness of learning neural nets with gradient descent? We answer this question in the negative. In particular, we give lower bounds for shallow graph neural networks, convolutional networks, invariant polynomials, and frame-averaged networks for permutation subgroups, which all scale either superpolynomially or exponentially in the relevant input dimension. Therefore, in spite of the significant inductive bias imparted via symmetry, actually learning the complete classes of functions represented by equivariant neural networks via gradient descent remains hard.
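To make the CSQ model mentioned above concrete, here is a minimal sketch, not taken from the paper. In the CSQ model a learner never sees individual labeled examples. It submits a bounded query function g of the input and receives an estimate of E[g(x) y] that is accurate up to a tolerance tau. The names `csq_oracle`, `cyclic_average`, `base`, and `target` are hypothetical. So are the cyclic-shift symmetrization used to build an invariant target and the uniform noise, which stands in for the oracle's tolerance (in the formal model the error may be adversarial).

```python
import random


def csq_oracle(query, samples, tau, rng):
    """Answer a correlational statistical query.

    Returns the empirical mean of query(x) * y over the samples, perturbed by
    noise of magnitude at most tau (an illustrative stand-in for the
    tolerance allowed to the oracle).
    """
    mean = sum(query(x) * y for x, y in samples) / len(samples)
    return mean + rng.uniform(-tau, tau)


def cyclic_average(f, x):
    """Average f over all cyclic shifts of x, so the result is shift-invariant."""
    n = len(x)
    return sum(f(x[k:] + x[:k]) for k in range(n)) / n


rng = random.Random(0)
n = 4


# Hypothetical target: a single ReLU-like unit, symmetrized over cyclic shifts.
def base(x):
    return max(0.0, x[0] - x[1])


def target(x):
    return cyclic_average(base, x)


# Gaussian inputs, labeled by the invariant target.
samples = []
for _ in range(2000):
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    samples.append((x, target(x)))


# A bounded query: the first coordinate clipped to [-1, 1].
def query(x):
    return max(-1.0, min(1.0, x[0]))


answer = csq_oracle(query, samples, tau=0.01, rng=rng)
print(f"noisy estimate of E[query(x) * y]: {answer:.4f}")
```

The label enters only through the product `query(x) * y`. The gradient of the squared loss depends on the labels through correlations of this form, which is how gradient descent fits into the model. The invariance of `target` illustrates the kind of symmetric function class to which the paper's lower bounds apply.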
