ARCLE: The Abstraction and Reasoning Corpus Learning Environment for Reinforcement Learning

Hosung Lee, Sejin Kim, Seungpil Lee, Sanha Hwang, Jihwan Lee, Byung-Jun Lee, Sundong Kim

TL;DR

ARCLE provides a Gymnasium-based reinforcement learning environment tailored to the Abstraction and Reasoning Corpus (ARC), enabling the study of learning under ARC's large discrete action space and diverse task set. The authors show that a PPO-based agent, augmented with auxiliary losses and a non-factorial policy, can learn ARCLE tasks, and that representation quality and policy structure critically affect performance. They demonstrate improvements in simplified settings and reveal limitations in continual RL under curriculum shifts, and they propose future directions including Meta-RL, generative modeling with GFlowNets, and World Models to advance abstraction skills. ARCLE thus offers a platform for investigating how reinforcement learning can support human-like abstract reasoning and generalization to unseen ARC tasks.

Abstract

This paper introduces ARCLE, an environment designed to facilitate reinforcement learning research on the Abstraction and Reasoning Corpus (ARC). Addressing this inductive reasoning benchmark with reinforcement learning presents three challenges: a vast action space, a hard-to-reach goal, and a variety of tasks. We demonstrate that an agent trained with proximal policy optimization (PPO) can learn individual tasks through ARCLE. The adoption of non-factorial policies and auxiliary losses improved performance, mitigating the issues associated with the action space and goal attainment. Based on these insights, we propose several research directions and motivations for using ARCLE, including MAML, GFlowNets, and World Models.

Paper Structure

This paper contains 45 sections, 6 equations, 15 figures, and 2 tables.

Figures (15)

  • Figure 1: Four different ARC tasks are presented, each requiring analysis of its provided demonstration pairs. The rule identified from this analysis must then be applied to a test input grid to produce the answer (test output) grid, which is blurred here for demonstration purposes. The specific rules for each task are as follows: Task 1 recolors all gray cells within a row to match the color found in the far-left column of that same row. Task 2 relocates four identical cyan objects appropriately, each no larger than 2$\times$2 in size. Task 3 determines the color of the topmost line in a stack of overlapping horizontal and vertical lines, and outputs a single pixel of this color. Task 4 transforms the test input grid by coloring all pixels blue except those at the intersections of even-numbered rows and columns.
  • Figure 2: Framework of ARCLE. The package consists of components: envs, actions, loaders, and wrappers.
  • Figure 3: The state transition process of ARCLE.
  • Figure 4: Every operation assigned in O2ARCEnv (version 0.2.5). The categories of operations (left), available operations (middle), and application examples of operations (right) are shown.
  • Figure 5: Performance of agents when various auxiliary losses are additionally used. The experiment is repeated four times, and the shaded regions denote $95\%$ confidence intervals.
  • ...and 10 more figures
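
To make the kind of rule an ARC task encodes more concrete, the Task 1 rule described in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from ARCLE: the function name `recolor_gray_by_row_leader`, the list-of-lists grid representation, and the color index `GRAY = 5` are all hypothetical choices made for this sketch.

```python
# Illustrative sketch only; not part of the ARCLE package.
# Assumption: a grid is a list of rows, each row a list of integer color codes.
# Assumption: gray is encoded as 5 (a hypothetical choice for this example).
GRAY = 5


def recolor_gray_by_row_leader(grid):
    """Return a new grid in which every gray cell takes the color of the
    far-left cell of its own row. The input grid is left unmodified."""
    return [
        [row[0] if cell == GRAY else cell for cell in row]
        for row in grid
    ]


example = [
    [1, 0, 5, 5],
    [2, 5, 0, 5],
    [3, 0, 0, 0],
]
result = recolor_gray_by_row_leader(example)
for row in result:
    print(row)
```

In an actual ARC task the solver is not given this rule; it has to infer it from the demonstration pairs and then apply it to the test input.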