A Neural Rewriting System to Solve Algorithmic Problems
Flavio Petruzzellis, Alberto Testolin, Alessandro Sperduti
TL;DR
The paper tackles the challenge of systematic, compositional generalization in neural models by targeting formula simplification through a modular Neural Rewriting System. It introduces three neural components (Selector $sel$, Solver $sol$, and Combiner $com$) that mirror a classic rewriting algorithm, enabling iterative leaf-level reductions of nested formulas. Empirical results across ListOps, Arithmetic, and Algebra show that the Neural Rewriting System achieves stronger out-of-distribution generalization than both the Neural Data Router baseline and GPT-4 prompting, while revealing that leaf selection on long inputs is the principal bottleneck. The work provides a principled, interpretable architecture for symbolic reasoning with neural networks and identifies concrete directions (e.g., improving length generalization and selector reliability) for future progress in algorithmic generalization.
Abstract
Modern neural network architectures still struggle to learn algorithmic procedures that require systematically applying compositional rules to solve out-of-distribution problem instances. In this work, we focus on formula simplification problems, a class of synthetic benchmarks used to study the systematic generalization capabilities of neural architectures. We propose a modular architecture designed to learn a general procedure for solving nested mathematical formulas while relying only on a minimal set of training examples. Inspired by rewriting systems, a classic framework in symbolic artificial intelligence, we include in the architecture three specialized and interacting modules: the Selector, trained to identify solvable sub-expressions; the Solver, which maps sub-expressions to their values; and the Combiner, which replaces sub-expressions in the original formula with the solution provided by the Solver. We benchmark our system against the Neural Data Router, a recent model specialized for systematic generalization, and a state-of-the-art large language model (GPT-4) probed with advanced prompting strategies. We demonstrate that our approach achieves a higher degree of out-of-distribution generalization than these alternatives on three different types of formula simplification problems, and we discuss its limitations by analyzing its failures.
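The interaction between the three modules can be illustrated with a purely symbolic sketch of the rewriting loop. This is not the paper's implementation: the functions `select`, `solve`, `combine`, and `simplify` are hypothetical hand-written stand-ins for the trained Selector, Solver, and Combiner, and the fully parenthesized integer arithmetic syntax is an assumption chosen for brevity.

```python
import re

# Assumed toy syntax: a leaf is a parenthesized binary operation on two
# integers, e.g. "(2+3)". Nested formulas contain leaves inside leaves.
LEAF = re.compile(r"\((-?\d+)([+\-*])(-?\d+)\)")


def select(formula):
    """Stand-in for the Selector: return the first solvable leaf, or None."""
    match = LEAF.search(formula)
    return match.group(0) if match else None


def solve(leaf):
    """Stand-in for the Solver: map a leaf sub-expression to its value."""
    match = LEAF.fullmatch(leaf)
    a, op, b = int(match.group(1)), match.group(2), int(match.group(3))
    return str({"+": a + b, "-": a - b, "*": a * b}[op])


def combine(formula, leaf, value):
    """Stand-in for the Combiner: replace the leaf with its value."""
    return formula.replace(leaf, value, 1)


def simplify(formula, max_steps=100):
    """Apply select -> solve -> combine until no leaf remains."""
    for _ in range(max_steps):
        leaf = select(formula)
        if leaf is None:
            return formula  # fully reduced
        formula = combine(formula, leaf, solve(leaf))
    raise RuntimeError("formula not reduced within max_steps")


print(simplify("((2+3)*(4-1))"))
```

Each iteration reduces one leaf, so the formula `((2+3)*(4-1))` passes through `(5*(4-1))` and `(5*3)` before reaching its final value. In the proposed system the three steps are carried out by learned modules rather than by a regular expression and a lookup table.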
