Learning to Add, Multiply, and Execute Algorithmic Instructions Exactly with Neural Networks
Artur Back de Luca, George Giapitzakis, Kimon Fountoulakis
TL;DR
This work investigates whether neural networks can learn to execute discrete binary algorithms exactly by analyzing two-layer networks in the NTK regime. It introduces a template-matching framework that encodes local bitwise rules into a block-structured training set, enabling exact algorithmic execution (permutations, binary addition, binary multiplication, and SBN instructions) with logarithmically many examples via an ensemble of infinite-width networks. The authors prove NTK-based exact learnability under controlled interference, and extend the results to high-probability guarantees using ensemble averaging, establishing ensemble size bounds that scale polynomially with bit-length for the studied tasks. The work also discusses limitations (orthogonality assumptions, bounded memory) and outlines future directions toward architectures capable of handling longer or variable-length inputs, such as RNNs, Transformers, or GNNs, while preserving the theoretical framework. Overall, the paper provides formal guarantees for exact neural execution of fundamental algorithms in a controlled NTK setting, offering insight into how discrete computations can be embedded and learned in neural systems with provable properties.
Abstract
Neural networks are known for their ability to approximate smooth functions, yet they fail to generalize perfectly to unseen inputs when trained on discrete operations. Such operations lie at the heart of algorithmic tasks such as arithmetic, which is often used as a test bed for algorithmic execution in neural networks. In this work, we ask: can neural networks learn to execute binary-encoded algorithmic instructions exactly? We use the Neural Tangent Kernel (NTK) framework to study the training dynamics of two-layer fully connected networks in the infinite-width limit and show how a sufficiently large ensemble of such models can be trained to execute exactly, with high probability, four fundamental tasks: binary permutations, binary addition, binary multiplication, and Subtract and Branch if Negative (SBN) instructions. Since SBN is Turing-complete, our framework extends to computable functions. We show that this can be achieved efficiently, using only logarithmically many training examples. Our approach relies on two techniques: structuring the training data to isolate bit-level rules, and controlling correlations in the NTK regime to align model predictions with the target algorithmic executions.
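To make the SBN instruction concrete, the following is a minimal Python sketch of a conventional SBN interpreter. It is an illustration only, not the paper's construction: the operand convention (a, b, c), meaning "Mem[b] = Mem[b] - Mem[a]; jump to c if the result is negative", is an assumption, as are the names run_sbn, ADD_PROGRAM, and max_steps. The sketch uses plain Python integers rather than the binary encodings the paper's networks operate on.

```python
def run_sbn(program, memory, max_steps=1000):
    """Execute a list of SBN instructions and return the final memory.

    Each instruction is a triple (a, b, c), assumed here to mean:
        Mem[b] = Mem[b] - Mem[a]; if Mem[b] < 0, jump to instruction c,
        otherwise continue with the next instruction.
    Execution halts when the program counter leaves the program
    or after max_steps instructions (a hypothetical safety limit).
    """
    mem = list(memory)  # work on a copy so the caller's list is unchanged
    pc = 0
    steps = 0
    while 0 <= pc < len(program) and steps < max_steps:
        a, b, c = program[pc]
        mem[b] -= mem[a]
        pc = c if mem[b] < 0 else pc + 1
        steps += 1
    return mem


# Addition built from two subtractions, with memory layout [x, y, tmp]
# and tmp initialised to 0:
#   step 0: tmp = tmp - x  (tmp becomes -x)
#   step 1: y   = y - tmp  (y becomes y + x)
# Both branch targets equal the following instruction index,
# so control flow is the same whether or not the result is negative.
ADD_PROGRAM = [
    (0, 2, 1),
    (2, 1, 2),
]

if __name__ == "__main__":
    print(run_sbn(ADD_PROGRAM, [3, 4, 0]))
```

The example shows how a single instruction type, composed repeatedly, can express addition. This is the sense in which exact execution of SBN steps would carry over to more general computations.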
