The committee machine: Computational to statistical gaps in learning a two-layers neural network
Benjamin Aubin, Antoine Maillard, Jean Barbier, Florent Krzakala, Nicolas Macris, Lenka Zdeborová
TL;DR
This work puts replica-based predictions for the two-layer committee machine on rigorous footing by proving the replica-symmetric free entropy formula via adaptive interpolation and linking it to the Bayes-optimal generalization error. It introduces an AMP algorithm, tracked by state evolution, that achieves Bayes-optimal performance over a broad range of parameters. It also identifies regimes where a low generalization error is information-theoretically achievable but AMP fails to reach it, which strongly suggests that no polynomial-time algorithm succeeds there and points to a large computational gap. The analysis uncovers a specialization phase transition as the number of hidden units K grows, with distinct behavior for Gaussian versus binary weights and in the large-K limit, indicating a rich landscape of algorithmic hardness in multi-layer networks. Overall, the paper connects statistical physics insights to provable asymptotics and practical inference algorithms, highlighting both the potential and the limits of efficient learning in two-layer architectures and outlining directions for extending the analysis to deeper models.
Abstract
Heuristic tools from statistical physics have been used in the past to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario for multi-layer neural networks. In this contribution, we provide a rigorous justification of these approaches for a two-layer neural network model called the committee machine. We also introduce a version of the approximate message passing (AMP) algorithm for the committee machine that performs optimal learning in polynomial time for a large set of parameters. We find that there are regimes in which a low generalization error is information-theoretically achievable while the AMP algorithm fails to deliver it, strongly suggesting that no efficient algorithm exists for those cases, and unveiling a large computational gap.
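
To make the teacher-student setting concrete, here is a minimal sketch in Python. It assumes the usual form of a committee machine (K sign hidden units whose outputs are combined by a majority vote, with second-layer weights fixed to one), i.i.d. Gaussian inputs, and an odd K so that ties cannot occur. The abstract does not spell out these details, and the function names `committee_output`, `make_teacher_dataset`, and `generalization_error` are hypothetical, chosen only for illustration. The sketch generates teacher labels and estimates a student's generalization error by Monte Carlo; it does not implement AMP.

```python
import numpy as np


def committee_output(X, W):
    """Label of a committee machine: majority vote over K sign hidden units.

    X: (n, d) inputs. W: (d, K) first-layer weights.
    Assumption: second-layer weights are all one, and K is odd (no ties).
    """
    hidden = np.sign(X @ W)               # (n, K) hidden-unit votes in {-1, +1}
    return np.sign(hidden.sum(axis=1))    # (n,) majority vote


def make_teacher_dataset(n, d, K, binary_weights=False, seed=0):
    """Draw teacher weights (Gaussian or binary) and n labelled samples."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    if binary_weights:
        W = rng.choice([-1.0, 1.0], size=(d, K))
    else:
        W = rng.standard_normal((d, K))
    return X, W, committee_output(X, W)


def generalization_error(W_student, W_teacher, n_test=10_000, seed=1):
    """Monte Carlo estimate of the probability that student and teacher
    disagree on a fresh Gaussian input."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_test, W_teacher.shape[0]))
    disagree = committee_output(X, W_student) != committee_output(X, W_teacher)
    return float(np.mean(disagree))


if __name__ == "__main__":
    X, W_teacher, y = make_teacher_dataset(n=200, d=50, K=3)
    W_random = np.random.default_rng(2).standard_normal(W_teacher.shape)
    print("random student:", generalization_error(W_random, W_teacher))
    print("perfect student:", generalization_error(W_teacher, W_teacher))
```

A student whose weights are independent of the teacher's should err on roughly half of the fresh inputs, while a student that recovers the teacher's weights, up to a permutation of the hidden units, makes no errors. Because the output is invariant under such permutations, the hidden units are interchangeable, which is the symmetry behind the specialization transition mentioned in the TL;DR.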
