Tackling GNARLy Problems: Graph Neural Algorithmic Reasoning Reimagined through Reinforcement Learning

Alex Schutz, Victor-Alexandru Darvariu, Efimia Panagiotaki, Bruno Lacerda, Nick Hawes

TL;DR

GNARL addresses core limitations of Neural Algorithmic Reasoning (NAR) by reframing algorithm execution as a Markov Decision Process (MDP) and training with imitation and reinforcement learning. It introduces an encode-process-act architecture that uses a proto-action strategy and action masking to produce valid solutions, and potentially several distinct ones, without post-processing. The framework is evaluated on polynomial-time problems from CLRS-30 and on NP-hard tasks: Minimum Vertex Cover (MVC), the Travelling Salesperson Problem (TSP), and Robust Graph Construction (RGC). It achieves high graph accuracy and competitive performance, in some settings without relying on an expert algorithm. This work demonstrates a general, graph-based approach to learning algorithms that can operate even when expert algorithms are unavailable, marking progress toward a unified combinatorial optimization framework.

Abstract

Neural Algorithmic Reasoning (NAR) is a paradigm that trains neural networks to execute classic algorithms by supervised learning. Despite its successes, important limitations remain: inability to construct valid solutions without post-processing and to reason about multiple correct ones, poor performance on combinatorial NP-hard problems, and inapplicability to problems for which strong algorithms are not yet known. To address these limitations, we reframe the problem of learning algorithm trajectories as a Markov Decision Process, which imposes structure on the solution construction procedure and unlocks the powerful tools of imitation and reinforcement learning (RL). We propose the GNARL framework, encompassing the methodology to translate problem formulations from NAR to RL and a learning architecture suitable for a wide range of graph-based problems. We achieve very high graph accuracy results on several CLRS-30 problems, performance matching or exceeding much narrower NAR approaches for NP-hard problems and, remarkably, applicability even when lacking an expert algorithm.

Paper Structure

This paper contains 49 sections, 1 equation, 6 figures, 20 tables, 12 algorithms.

Figures (6)

  • Figure 1: A. Key correspondences leveraged by GNARL to cast NAR as an RL problem. B. Unlike standard NAR, GNARL is trainable without an expert algorithm by using a reward signal. C. Examples of the MDP $\mathcal{M} = \langle S, \mathcal{A}, \mathcal{T}, R, h \rangle$ for a polynomial-time solvable problem and an NP-hard problem. At each step, a node is selected, the transition function yields the next state, and a reward is obtained.
  • Figure 2: Unique solutions in 100 runs found on a single graph, $|V|=64$, using sampled actions (10 seeds).
  • Figure 3: Performance on RGC for different graph sizes ($\uparrow$).
  • Figure 4: Architecture of the GNARL framework. Each state and input feature is encoded separately, and aggregated by feature location. The processor embeds the state features into a proto-action space. The actor calculates node probabilities using the similarity of the graph embedding to the proto-action vectors. The critic uses the graph embedding to estimate the state value.
  • Figure 5: The proto-action is generated from a linear transformation of the graph embedding. This proto-action is then compared to each node embedding using Euclidean distance, which is negated and passed through a softmax function to produce the action distribution.
  • ...and 1 more figure
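
The actor step described in the captions of Figures 4 and 5 can be illustrated with a minimal NumPy sketch. It follows the caption text only: a linear map of the graph embedding gives the proto-action, the negated Euclidean distance to each node embedding gives the logits, and a softmax gives the action distribution. The function name `proto_action_policy`, the parameters `W` and `b`, the array shapes, and the way the action mask is applied (setting invalid logits to negative infinity before the softmax) are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def proto_action_policy(node_emb, graph_emb, W, b, mask):
    """Return a probability distribution over nodes (illustrative sketch).

    node_emb  : (n, d) array, one embedding per node
    graph_emb : (g,) array, embedding of the whole graph
    W, b      : (d, g) matrix and (d,) bias of the hypothetical linear layer
    mask      : (n,) boolean array, True where selecting the node is valid
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("at least one action must be valid")

    # Proto-action: linear transformation of the graph embedding.
    proto = W @ graph_emb + b

    # Negated Euclidean distance between the proto-action and each node.
    logits = -np.linalg.norm(node_emb - proto, axis=1)

    # Assumed masking scheme: invalid nodes receive zero probability.
    logits = np.where(mask, logits, -np.inf)

    # Numerically stable softmax over the valid nodes.
    logits = logits - logits[mask].max()
    weights = np.exp(logits)
    return weights / weights.sum()


# Small usage example with random embeddings (values are arbitrary).
rng = np.random.default_rng(0)
node_emb = rng.normal(size=(5, 4))
graph_emb = node_emb.mean(axis=0)  # mean pooling is an assumption
W, b = rng.normal(size=(4, 4)), np.zeros(4)
mask = np.array([True, True, False, True, True])
probs = proto_action_policy(node_emb, graph_emb, W, b, mask)
action = rng.choice(len(probs), p=probs)  # sampled, never the masked node
```

Sampling from `probs` rather than taking the argmax is what allows repeated runs on one graph to return different valid solutions, as reported in the Figure 2 caption.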

Theorems & Definitions (3)

  • Definition 1: Travelling Salesperson Problem
  • Definition 2: Minimum Vertex Cover
  • Definition 3: Robust Graph Construction