Searching Latent Program Spaces

Matthew V Macfarlane, Clement Bonnet

TL;DR

The paper introduces Latent Program Network (LPN), a neural architecture that learns a continuous latent space of implicit programs and performs gradient-based search at test time to adapt to new tasks. By decoupling program representation from raw IO mappings via an encoder/decoder pair and a latent optimizer, LPN combines neural scalability with the adaptability of symbolic approaches, reducing reliance on predefined DSLs. Across pattern and string-manipulation tasks and the ARC-AGI 2024 benchmark, LPN demonstrates strong test-time adaptation, superior out-of-distribution generalization, and clear advantages over in-context learning and test-time training when scalable latent search is employed. The work highlights latent-space search as an effective mechanism for compositional generalization and targeted program adaptation, while acknowledging limited training-program diversity as a key limitation and pointing toward future exploration of alternative optimization strategies and discrete program representations.

Abstract

General intelligence requires systems that acquire new skills efficiently and generalize beyond their training distributions. Although program synthesis approaches generalize strongly, they scale poorly: their combinatorial search spaces quickly become impractical, so they require human-generated DSLs or pre-trained priors to narrow the search. Deep learning methods, on the other hand, have been highly successful, but they lack structured test-time adaptation and rely on heavy stochastic sampling or expensive gradient updates for fine-tuning. In this work, we propose the Latent Program Network (LPN), a novel architecture that builds test-time search directly into neural models. LPN learns a latent space of implicit programs -- neurally mapping inputs to outputs -- through which it can search using gradients at test time. LPN combines the adaptability of symbolic approaches and the scalability of neural methods. It searches through a compact latent space at test time and bypasses the need for pre-defined domain-specific languages. On a range of programming-by-examples tasks, LPN matches or outperforms in-context learning and test-time training methods. Tested on the ARC-AGI benchmark, we demonstrate that LPN can both learn a compact program space and search through it at test time to adapt to novel tasks. LPN doubles its performance on out-of-distribution tasks when test-time search is switched on.
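
The encode, search, decode loop described in the abstract can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration and not the authors' implementation: the "decoder" is a fixed affine map whose two coefficients play the role of the latent program, the "encoder" is a crude hand-written guess averaged over the I/O pairs, and the search minimizes squared error with hand-derived gradients instead of optimizing a learned decoder's likelihood. The names `encode`, `decode`, `refine`, and `loss` are hypothetical.

```python
from typing import List, Tuple

Pair = Tuple[float, float]   # one (input, output) example
Latent = List[float]         # toy latent program: [slope, offset]


def decode(z: Latent, x: float) -> float:
    """Toy decoder: 'executes' latent program z on input x (an affine map)."""
    return z[0] * x + z[1]


def encode(pairs: List[Pair]) -> Latent:
    """Toy encoder: a crude per-pair guess (constant program), averaged over pairs."""
    guesses = [(0.0, y) for _, y in pairs]
    n = len(guesses)
    return [sum(g[0] for g in guesses) / n, sum(g[1] for g in guesses) / n]


def loss(z: Latent, pairs: List[Pair]) -> float:
    """Mean squared error of the decoder on the given I/O pairs."""
    return sum((decode(z, x) - y) ** 2 for x, y in pairs) / len(pairs)


def refine(z: Latent, pairs: List[Pair], steps: int = 500, lr: float = 0.1) -> Latent:
    """Test-time search: gradient descent on the latent only; the decoder stays fixed."""
    z = list(z)
    n = len(pairs)
    for _ in range(steps):
        # Analytic gradients of the mean squared error w.r.t. slope and offset.
        g_slope = sum(2.0 * (decode(z, x) - y) * x for x, y in pairs) / n
        g_offset = sum(2.0 * (decode(z, x) - y) for x, y in pairs) / n
        z[0] -= lr * g_slope
        z[1] -= lr * g_offset
    return z


# Specification: three I/O pairs generated by the hidden program y = 2x + 1.
pairs = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

z_init = encode(pairs)          # imperfect initial latent from the encoder
z_star = refine(z_init, pairs)  # latent refined to better explain the pairs
prediction = decode(z_star, 3.0)  # apply the refined program to a new input
```

In this toy setup the encoder's initial guess explains the pairs poorly, and the refinement step moves the latent toward one that reproduces them, which mirrors the role the abstract assigns to test-time search. In the actual model both encoder and decoder are learned networks, so the gradients would come from automatic differentiation.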

Paper Structure

This paper contains 61 sections, 16 equations, 28 figures, 16 tables, 2 algorithms.

Figures (28)

  • Figure 1: Inference of the Latent Program Network (LPN) model. (Left): the encoder maps I/O pairs to a latent space of encoded programs. (Middle): the latent program is refined during an optimization process to best explain the given I/O pairs (detailed in the appendix figure on latent optimization). (Right): the decoder executes the latent program to generate the desired output for a newly given input.
  • Figure 2: Ablation on the role of the encoder and latent optimization. Latents are initialized from the mean of the encoder latents (except for orange). Grad $N$ stands for doing latent optimization with $N$ gradient steps. Both the encoder initialization and the latent optimization matter for LPN.
  • Figure 3: Exact-match accuracy (%) on the out-of-distribution (OOD) pattern task as the specification size is scaled from 1 input-output pair to 19 pairs, for different inference methods.
  • Figure 4: Example of input (top row) and output (bottom row) pairs of a specification sampled from the Pattern task. Each sample is a batch of 4 pairs that share the same pattern.
  • Figure 7: (Left) A 2D pattern task with inputs containing marker points where patterns should be placed, with patterns varying for each program. (Right) The latent traversal visualizes how moving through the latent space changes the pattern the decoder predicts at the marker points.
  • ...and 23 more figures