A Divide-Align-Conquer Strategy for Program Synthesis

Jonas Witt, Sebastijan Dumančić, Tias Guns, Claus-Christian Carbon

TL;DR

The paper introduces Divide-Align-Conquer (DA&C), a hierarchical, component-based approach to program synthesis that uses structural alignment, grounded in Structure-Mapping Theory, to decompose complex tasks into tractable subproblems. The agent, BEN, partitions each example into meaningful components, aligns input and output parts with the Structure-Mapping Engine (SME), and learns context-specific transformation rules. BEN achieves higher predictive accuracy than ILP baselines on string transformations and shows competitive results in the ARC visual reasoning domain. The approach reaches an average time complexity of $\mathcal{O}(m)$ in the number $m$ of partial programs for highly structured domains such as strings, yields compact solution programs that generalize well, and includes ablations illustrating the value of analogical reasoning and segmentation. Limitations remain (e.g., reliance on domain priors and the current primitive sets), and the authors point to neural guidance and more expressive composition of transformations as future work toward scalable, structured program synthesis.

Abstract

A major bottleneck in search-based program synthesis is the exponentially growing search space, which makes learning large programs intractable. Humans mitigate this problem by leveraging the compositional nature of the real world: in structured domains, a logical specification can often be decomposed into smaller, complementary solution programs. We show that compositional segmentation can be applied in the programming-by-examples setting to divide the search for large programs across multiple smaller program synthesis problems. For each example, we search for a decomposition into smaller units which maximizes the reconstruction accuracy in the output under a latent task program. A structural alignment of the constituent parts in the input and output leads to pairwise correspondences used to guide the program synthesis search. In order to align the input/output structures, we make use of the Structure-Mapping Theory (SMT), a formal model of human analogical reasoning which originated in the cognitive sciences. We show that decomposition-driven program synthesis with structural alignment outperforms Inductive Logic Programming (ILP) baselines on string transformation tasks even with minimal knowledge priors. Unlike existing methods, the predictive accuracy of our agent monotonically increases for additional examples and achieves an average time complexity of $\mathcal{O}(m)$ in the number $m$ of partial programs for highly structured domains such as strings. We extend this method to the complex setting of visual reasoning in the Abstraction and Reasoning Corpus (ARC), for which ILP methods were previously infeasible.
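
The divide, align, and conquer steps described in the abstract can be illustrated with a toy sketch on a string task. This is not the authors' implementation: the primitive set, the function names (`divide`, `align`, `conquer`), and the example strings are hypothetical, and character-level similarity from Python's `difflib` is used as a crude stand-in for structural alignment with SMT. The point is only that each aligned input/output pair becomes its own small synthesis problem.

```python
from difflib import SequenceMatcher

# Hypothetical primitive set; the paper's actual primitives are not listed here.
PRIMITIVES = {
    "identity": lambda s: s,
    "strip_punct": lambda s: s.strip(",.;"),
    "upper": str.upper,
    "lower": str.lower,
}

def divide(text, sep=None):
    """Divide: split an example into component parts (here, plain tokens)."""
    return text.split(sep)

def align(in_parts, out_parts):
    """Align: pair each output part with its most similar input part.

    Character-level similarity is an assumption standing in for the
    structural alignment used in the paper.
    """
    pairs = []
    for out in out_parts:
        best = max(in_parts, key=lambda part: SequenceMatcher(None, part, out).ratio())
        pairs.append((best, out))
    return pairs

def conquer(pairs):
    """Conquer: solve each correspondence with its own small search."""
    programs = []
    for src, dst in pairs:
        # First primitive that maps the input part onto the output part, if any.
        name = next((n for n, f in PRIMITIVES.items() if f(src) == dst), None)
        programs.append(name)
    return programs

example_in = "Mon 10:30, Tue 14:00"
example_out = "10:30,14:00"
pairs = align(divide(example_in), divide(example_out, ","))
programs = conquer(pairs)
print(pairs)     # [('10:30,', '10:30'), ('14:00', '14:00')]
print(programs)  # ['strip_punct', 'identity']
```

Each correspondence is searched independently over the small primitive set, so the total work in this sketch grows with the number of correspondences rather than with the size of one monolithic program.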

Paper Structure

This paper contains 33 sections, 5 equations, 12 figures, 5 tables, 3 algorithms.

Figures (12)

  • Figure 1: Programming by Examples (PBE) tasks from the Abstraction and Reasoning Corpus (ARC) [chollet_measure_2019] (ARC panel) and the real-world string transformations data set [cropper_learning_2020] (strings panel). Agents must search for a program that transforms inputs into outputs. ARC panel: "Color the output light-blue whenever there is a light-blue connecting pathway between the green squares in the input." Strings panel: "Extract all times from the meeting schedule and concatenate them using commas."
  • Figure 2: A divide, align, & conquer (DA&C) strategy yields a compact and well-generalizing program for the string transformation task in Figure 1.
  • Figure 3: Every example is decomposed into component parts (e.g., visual objects in a scene, words in a sentence). We use the information about components and their relations to produce a structural alignment between the input and output scenes. Each correspondence from this alignment is solved independently using off-the-shelf synthesis techniques. Correspondences are considered in the order in which they contribute to the structural alignment (analogy). For each unique partial program, we learn a formula which specifies its corresponding input space (the space of components on which it should be executed). The combination of an input space and its transformation is called a transformation rule. A solution program consists of a set of transformation rules that, if applied to the components in the input, reconstruct all components in the output.
  • Figure 4: Divide-align-conquer synthesis grammar $\mathcal{G}$.
  • Figure 5: Segmentation grammar $\mathcal{G}_{decomp}$ used for abstract visual reasoning tasks in ARC.
  • ...and 7 more figures
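
The notion of a transformation rule from the Figure 3 caption (an input space paired with a partial program) can be sketched as a small data structure. This is an illustrative assumption, not the paper's representation: the class `TransformationRule`, the helper `apply_program`, the regular expression, and the final comma join are all hypothetical names and choices made for the "extract all times" string task.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class TransformationRule:
    # Formula describing the input space: which components the rule applies to.
    input_space: Callable[[str], bool]
    # Partial program executed on every component in that input space.
    transform: Callable[[str], str]

def apply_program(rules: List[TransformationRule], components: List[str]) -> List[str]:
    """Apply the first matching rule to each component; unmatched ones are dropped."""
    outputs = []
    for comp in components:
        for rule in rules:
            if rule.input_space(comp):
                outputs.append(rule.transform(comp))
                break
    return outputs

# Hypothetical rule: components that look like a time are kept, minus punctuation.
TIME = re.compile(r"^\d{1,2}:\d{2}[,.;]?$")
rules = [
    TransformationRule(
        input_space=lambda c: TIME.match(c) is not None,
        transform=lambda c: c.strip(",.;"),
    )
]

# Unseen input: the single rule generalizes to any number of times.
components = "Mon 10:30, Tue 14:00 Wed 9:15.".split()
result = ",".join(apply_program(rules, components))
print(result)  # 10:30,14:00,9:15
```

Because the input space is a predicate over components rather than a fixed position in the string, the same one-rule program covers inputs with more or fewer times than the training examples contained.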