NSA: Neuro-symbolic ARC Challenge

Paweł Batorski, Jannik Brinkmann, Paul Swoboda

TL;DR

This work tackles the Abstraction and Reasoning Corpus (ARC) by proposing NSA, a neuro-symbolic pipeline that combines a transformer-based proposal generator with a symbolic, graph-based DSL search (ARGA/ARGAe). The DSL encodes transformations as graph abstractions with filters and primitives, while the transformer suggests the right primitives and their order, with test-time adaptation via synthetic task generation to tailor the proposals to each task. Pre-training on a large corpus of synthetic ARC-like tasks and subsequent fine-tuning during inference enable efficient search within ARC's 30-minute per-task limit, leading to state-of-the-art performance on the ARC evaluation set (27% improvement over baselines). The results demonstrate the value of integrating learned proposal generation with structured symbolic search for abstract visual reasoning, and point to scalable directions in extending the DSL and tuning adaptation strategies under compute constraints.

Abstract

The Abstraction and Reasoning Corpus (ARC) evaluates general reasoning capabilities that are difficult for both machine learning models and combinatorial search methods. We propose a neuro-symbolic approach that combines a transformer for proposal generation with combinatorial search using a domain-specific language. The transformer narrows the search space by proposing promising search directions, which allows the combinatorial search to find the actual solution quickly. We pre-train the transformer on synthetically generated data. At test time, we generate additional task-specific training tasks and fine-tune our model. Our results surpass the comparable state of the art on the ARC evaluation set by 27% and compare favourably on the ARC train set. We make our code and dataset publicly available at https://github.com/Batorskq/NSA.
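The division of labour described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four grid primitives below are hypothetical stand-ins for a DSL (they are not ARGA's primitives), and the `proposed` list is hard-coded where NSA would use the transformer's output. The point is only that the search enumerates sequences over the proposed subset instead of the full DSL.

```python
from itertools import product

# Toy DSL of grid primitives. These are illustrative assumptions,
# not the primitives of ARGA or ARGAe.
def flip_h(g):
    return [row[::-1] for row in g]

def flip_v(g):
    return [row[:] for row in g[::-1]]

def transpose(g):
    return [list(col) for col in zip(*g)]

def identity(g):
    return [row[:] for row in g]

PRIMITIVES = {
    "flip_h": flip_h,
    "flip_v": flip_v,
    "transpose": transpose,
    "identity": identity,
}

def apply_program(names, grid):
    """Apply a sequence of primitive names to a grid, left to right."""
    for name in names:
        grid = PRIMITIVES[name](grid)
    return grid

def search(train_pairs, proposed, max_depth=2):
    """Enumerate sequences over the proposed primitives only and return the
    first one that reproduces every training output, or None."""
    for depth in range(1, max_depth + 1):
        for names in product(proposed, repeat=depth):
            if all(apply_program(names, x) == y for x, y in train_pairs):
                return names
    return None

# One training pair whose output is the input rotated 90 degrees clockwise.
train_pairs = [([[1, 2], [3, 4]], [[3, 1], [4, 2]])]

# Stand-in for the transformer's proposal (hard-coded assumption).
proposed = ["transpose", "flip_h"]

program = search(train_pairs, proposed)
prediction = apply_program(program, [[5, 6], [7, 8]])
```

With two proposed primitives and depth 2, the search checks at most 6 candidate programs; over all four primitives it would check up to 20. The gap grows exponentially with depth, which is why a good proposal matters under a time limit.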

Paper Structure (25 sections, 7 figures, 2 tables)

Figures (7)

  • Figure 1: An overview of the NSA framework: Starting with input-output pairs, a transformer model proposes potential transformation primitives. These are then fed to the symbolic combinatorial search (ARGA; Xu et al., 2023), which identifies the correct overall transformation and the corresponding parameters that solve the task for the given training input-output pairs. Finally, the selected transformation is applied to the test input to generate the final prediction.
  • Figure 2: Input-output pairs and a test image from selected ARC tasks, each with a distinct transformation challenge. Our solution is illustrated for each task. ARGA cannot solve any of the first three tasks because its DSL cannot represent the required transformations. Our extended DSL, ARGAe, can represent them but may time out during search, so it fails to find them in practice. For the bottom-right task, ARGA can represent the transformation in principle, but neither ARGA nor ARGAe finds it before the search times out. Transformation primitives absent from ARGA's DSL are marked in red: extract, duplicate, and upscale_grid. Primitives present in ARGA's original DSL, such as update_color, are highlighted in blue. For the upper-right and lower-left tasks, all necessary primitives are missing from ARGA; for the upper-left task, one primitive is present and another is missing. NSA addresses this dilemma by proposing the necessary transformation primitives to explore during combinatorial search.
  • Figure 3: Two distinct abstractions of the same grid. The upper abstraction groups adjacent pixels of the same color as a single node, while the bottom abstraction groups vertically adjacent components as a single node. Each abstraction also uses a different method for associating nodes within the graph. The image includes the filter, filter parameters, transformation, and transformation parameters needed to achieve the desired output. Note that the choice of abstraction is crucial, since the bottom abstraction will not lead to a correct solution.
  • Figure 4: Hindsight relabeling example: Starting with the original input, we sample abstractions, filters, filter parameters, transformations and transformation parameters. We apply them to the grid, generating new input-output pairs with the corresponding transformation. The labeled pair is added to the transformer's training set.
  • Figure 5: Comparison of the execution times of NSA's three components: Data Generation, Fine-Tuning, and Task Solving. We averaged the results across all subsets of tasks we solved.
  • ...and 2 more figures
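
The captions of Figures 3 and 4 can be made concrete with a small sketch. It is an illustration under stated assumptions, not the paper's code: the abstraction groups 4-adjacent pixels of the same colour and treats colour 0 as background, the filter name `filter_by_color` and the label layout are hypothetical, and only the `update_color` transformation named in Figure 2 is sampled.

```python
import random
from collections import deque

def same_color_components(grid, background=0):
    """Abstraction: group 4-adjacent pixels of the same non-background
    colour into one node (background handling is an assumption)."""
    h, w = len(grid), len(grid[0])
    seen, nodes = set(), []
    for r in range(h):
        for c in range(w):
            if grid[r][c] == background or (r, c) in seen:
                continue
            color, pixels = grid[r][c], []
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and (ny, nx) not in seen
                            and grid[ny][nx] == color):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            nodes.append({"color": color, "pixels": pixels})
    return nodes

def update_color(grid, node, new_color):
    """Transformation: recolour every pixel of one node, on a copy."""
    out = [row[:] for row in grid]
    for y, x in node["pixels"]:
        out[y][x] = new_color
    return out

def hindsight_relabel(grid, rng, n_colors=10):
    """Sample a filter and a transformation, apply them, and return
    (input, output, label) as a new training example."""
    nodes = same_color_components(grid)
    if not nodes:
        return None
    filter_color = rng.choice(sorted({n["color"] for n in nodes}))
    new_color = rng.choice(
        [c for c in range(1, n_colors) if c != filter_color])
    out = grid
    for node in nodes:
        if node["color"] == filter_color:  # filter: keep matching nodes
            out = update_color(out, node, new_color)
    label = {  # hypothetical label layout
        "abstraction": "same_color_components",
        "filter": "filter_by_color",
        "filter_params": {"color": filter_color},
        "transformation": "update_color",
        "transformation_params": {"color": new_color},
    }
    return grid, out, label

grid = [[1, 1, 0],
        [0, 2, 0],
        [2, 2, 0]]
nodes = same_color_components(grid)
example = hindsight_relabel(grid, random.Random(0))
```

Because the output is produced by applying a known transformation, the label is correct by construction, so no solver is needed to annotate the synthetic pair.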

Theorems & Definitions (1)

  • Remark