Learning Compositional Rules via Neural Program Synthesis

Maxwell I. Nye, Armando Solar-Lezama, Joshua B. Tenenbaum, Brenden M. Lake

TL;DR

The paper tackles learning compositional rules from very few examples by framing rule induction as neural program synthesis guided by symbolic checking. It combines a neural proposer that generates candidate grammars with a symbolic interpreter and a search procedure to enforce consistency with observed examples, enabling explicit rule learning and out-of-distribution generalization. Across Miniscan, SCAN challenges, and cross-language number-word decoding, the approach achieves strong or perfect performance, often outperforming neural meta-learning baselines and traditional program synthesis methods. This neuro-symbolic framework improves generalization, interpretability, and data efficiency, suggesting a viable path toward human-like, rule-based AI. The work also lays groundwork for future joint learning of meta-grammars and interpretable hybrid systems.

Abstract

Many aspects of human reasoning, including language, require learning rules from very little data. Humans can do this, often learning systematic rules from very few examples, and combining these rules to form compositional rule-based systems. Current neural architectures, on the other hand, often fail to generalize in a compositional manner, especially when evaluated in ways that vary systematically from training. In this work, we present a neuro-symbolic model which learns entire rule systems from a small set of examples. Instead of directly predicting outputs from inputs, we train our model to induce the explicit system of rules governing a set of previously seen examples, drawing upon techniques from the neural program synthesis literature. Our rule-synthesis approach outperforms neural meta-learning techniques in three domains: an artificial instruction-learning domain used to evaluate human learning, the SCAN challenge datasets, and learning rule-based translations of number words into integers for a wide range of human languages.

Paper Structure

This paper contains 26 sections, 2 equations, 10 figures, 7 tables.

Figures (10)

  • Figure 1: Illustration of our synthesis-based rule learner and comparison to previous work. A) Previous work (Lake, 2019): Support examples are encoded into an external neural memory. A query output is predicted by conditioning on the query input sequence and interacting with the external memory via attention. B) Our model: Given a support set of input-output examples, our model produces a distribution over candidate grammars. We sample from this distribution and symbolically check the consistency of each sampled grammar against the support set until a grammar is found that satisfies the input-output examples in the support set. This approach allows much more effective search than selecting the maximum likelihood grammar from the network.
  • Figure 2: Illustration of our synthesis-based rule learner neural architecture and grammar application. Support examples are encoded via BiLSTMs. The decoder LSTM attends over the resulting vectors and decodes a grammar, which can be symbolically applied to held-out query inputs. Middle: an example of a fully synthesized grammar which solves the task in Figure 3.
  • Figure 3: An example of few-shot learning of instructions. In Lake et al. (2019), participants learned to execute instructions in a novel language of nonce words by producing sequences of colored circles. Human performance is shown next to each query instruction, as the percent correct across participants. When conditioned on the support set, our model can predict the correct output sequences on the held-out query instructions by synthesizing the grammar in Figure 2.
  • Figure 4: Miniscan generalization results. We train on random grammars with 3-4 primitives, 2-4 higher order rules, and 10-20 support examples. Left: At test time, we vary the number of higher-order rules. The synthesis-based approach using search achieves near-perfect accuracy for most test conditions. Right: Length generalization results. A key challenge for compositional learning is generalization across lengths. We plot accuracy as a function of query output length for the "4 higher-order rules" test condition. The accuracy of our synthesis approach does not degrade as a function of query output length, whereas the performance of baselines decreases.
  • Figure 5: Right: Example SCAN data. Each example consists of a synthetic language command (top) paired with a discrete action sequence (bottom). Figure adapted from Andreas (2019). Left: Induced grammar which solves SCAN.
  • ...and 5 more figures
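
The propose-and-check loop described in the Figure 1 and Figure 2 captions can be illustrated with a minimal Python sketch. This is not the authors' implementation. The rule format, the helpers `match`, `apply_grammar`, `consistent`, `propose`, and `synthesize`, the `budget` parameter, and the nonce words are all assumptions made for illustration. In particular, `propose` is a toy enumerator standing in for sampling from the neural proposer, and the interpreter handles only a simplified rewrite-rule format in which the first matching rule fires.

```python
from itertools import permutations, product

VARS = {"x1", "x2"}  # hypothetical variable names used in rule patterns


def match(pattern, tokens):
    """Bind each variable in `pattern` to a non-empty token span, or return None."""
    if not pattern:
        return {} if not tokens else None
    head, rest = pattern[0], pattern[1:]
    if head in VARS:
        # Try every non-empty prefix that leaves enough tokens for the rest.
        for i in range(1, len(tokens) - len(rest) + 1):
            sub = match(rest, tokens[i:])
            if sub is not None:
                return {head: tokens[:i], **sub}
        return None
    if tokens and tokens[0] == head:
        return match(rest, tokens[1:])
    return None


def apply_grammar(grammar, tokens):
    """Symbolically apply an ordered rule list; the first rule that matches fires."""
    for lhs, rhs in grammar:
        bindings = match(lhs, tokens)
        if bindings is None:
            continue
        output = []
        for sym in rhs:
            if sym in bindings:
                sub = apply_grammar(grammar, bindings[sym])  # recurse on the bound span
                if sub is None:
                    break
                output.extend(sub)
            else:
                output.append(sym)
        else:
            return output
    return None


def consistent(grammar, support):
    """Check a candidate grammar against every support example."""
    return all(apply_grammar(grammar, i.split()) == o.split() for i, o in support)


def propose(words, colors):
    """Toy stand-in for the neural proposer: enumerate a tiny hypothesis space."""
    fep_bodies = [["x1", "x1"], ["x1", "x1", "x1"]]
    kiki_bodies = [["x1", "x2"], ["x2", "x1"]]
    for perm in permutations(colors, len(words)):
        for fep, kiki in product(fep_bodies, kiki_bodies):
            grammar = [(["x1", "kiki", "x2"], kiki), (["x1", "fep"], fep)]
            grammar += [([w], [c]) for w, c in zip(words, perm)]
            yield grammar


def synthesize(support, candidates, budget=1000):
    """Return the first candidate consistent with the support set, within a budget."""
    for n, grammar in enumerate(candidates):
        if n >= budget:
            break
        if consistent(grammar, support):
            return grammar
    return None


support = [
    ("dax", "RED"),
    ("lug", "BLUE"),
    ("wif", "GREEN"),
    ("dax fep", "RED RED RED"),
    ("dax kiki lug", "BLUE RED"),
]
grammar = synthesize(support, propose(["dax", "lug", "wif"], ["RED", "GREEN", "BLUE"]))
result = apply_grammar(grammar, "lug fep kiki wif".split())
print(" ".join(result))
```

Because the returned grammar is an explicit rule list, it can be applied to a query longer than any support example, which is the property the length-generalization results in Figure 4 probe. Only the candidate generator would change if a learned proposer replaced the enumerator; the symbolic check stays the same.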