Finding path and cycle counting formulae in graphs with Deep Reinforcement Learning

Jason Piquenot, Maxime Bérar, Pierre Héroux, Jean-Yves Ramel, Romain Raveaux, Sébastien Adam

TL;DR

This work addresses the discovery of efficient matrix-based formulae for counting graph substructures (paths and cycles) by searching a space constrained by a Context-Free Grammar (CFG). It introduces Grammar Reinforcement Learning (GRL), a deep RL approach that runs Monte Carlo Tree Search over a CFG and is implemented with Gramformer, a transformer model that emulates a PushDown Automaton. GRL recovers known Voropaev-style formulae and discovers novel, more efficient expressions for path counts up to length six, with speedups of up to 6.25x. The work also adapts the framework to edge-level and directed-graph counting, and outlines how more expressive k-WL CFGs could extend it beyond length six, with potential impact on scalable graph analytics and the interpretability of substructure counting.

Abstract

This paper presents Grammar Reinforcement Learning (GRL), a reinforcement learning algorithm that uses Monte Carlo Tree Search (MCTS) and a transformer architecture that models a Pushdown Automaton (PDA) within a context-free grammar (CFG) framework. Taking as a use case the problem of efficiently counting paths and cycles in graphs, a key challenge in network analysis, computer science, biology, and social sciences, GRL discovers new matrix-based formulas for path/cycle counting that improve computational efficiency by factors of two to six with respect to state-of-the-art approaches. Our contributions include: (i) a framework for generating gramformers that operate within a CFG, (ii) the development of GRL for optimizing formulas within grammatical structures, and (iii) the discovery of novel formulas for graph substructure counting, leading to significant computational improvements.
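To make "matrix-based formula for path counting" concrete, the sketch below checks a classical identity for simple undirected graphs: for distinct nodes $i \neq j$, the number of 3-paths is $(A^3)_{ij} - A_{ij}(d_i + d_j - 1)$, where $d_i$ is the degree of node $i$. This is a textbook example and not one of the formulas discovered by GRL. The function names and the example graph are illustrative assumptions.

```python
from itertools import permutations

def matmul(X, Y):
    """Plain-Python product of two square matrices given as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def three_paths_formula(A):
    """Count 3-paths between distinct nodes with a closed-form matrix expression.

    (A^3)[i][j] counts walks of length 3. For i != j the only non-simple walks
    are i-j-b-j (d_j of them), i-a-i-j (d_i of them), and i-j-i-j, which is
    counted in both groups; all of them require the edge (i, j).
    """
    n = len(A)
    deg = [sum(row) for row in A]
    A3 = matmul(matmul(A, A), A)
    return [[0 if i == j else A3[i][j] - A[i][j] * (deg[i] + deg[j] - 1)
             for j in range(n)] for i in range(n)]

def three_paths_brute_force(A):
    """Reference count: enumerate every ordered tuple of 4 distinct nodes."""
    n = len(A)
    P = [[0] * n for _ in range(n)]
    for i, a, b, j in permutations(range(n), 4):
        if A[i][a] and A[a][b] and A[b][j]:
            P[i][j] += 1
    return P

# Hypothetical example graph: 4-cycle 0-1-2-3 with chord 0-2 and pendant node 4.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (3, 4)]
n = 5
A = [[0] * n for _ in range(n)]
for u, v in edges:
    A[u][v] = A[v][u] = 1

assert three_paths_formula(A) == three_paths_brute_force(A)
```

The brute-force enumeration grows as $n^4$, whereas the closed form needs only two matrix products. That gap is what makes such formulas attractive, and longer paths require more correction terms.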

Paper Structure (25 sections, 13 theorems, 71 equations, 14 figures, 13 algorithms)

Key Result

Theorem 3.1

$G_3$ is as expressive as $3\text{-WL}$

Figures (14)

  • Figure 1: The left diagram illustrates a path in the derivation tree of the PDA $D_3$ which generates the sentence $J\odot A^2 \in L(G_3)$. The right diagram details the process of generating this sentence, emphasizing the transcription and transposition loops. As depicted, the stack fills during transposition steps and empties during transcription steps, eventually leading to the derivation of a sentence from the language.
  • Figure 2: From left to right: The agent selects a set of $N$ sentences based on an MCTS heuristic. These sentences are computed for a given set of graphs. The computation is then evaluated against a ground truth, yielding a linear combination of the sentences and a value representing their pertinence. This value is subsequently backpropagated through the MCTS search tree.
  • Figure 3: In the acting phase, rules are selected based on both the MCTS algorithm and the neural network outputs. Each time MCTS selects a node, the decision, empirical policy, and value of the node are stored in a replay buffer. During the learning phase, the neural network is updated by predicting the policy and value functions based on the decisions stored in the replay buffer.
  • Figure 4: From PDA to grammar tokens: $D_3$ is turned into three sets of tokens. The variables corresponding to each element of $\delta_r$ are turned into variable tokens. For each variable token, a set of rule tokens is defined. Then, for each corresponding terminal symbol of $\delta_w$, a terminal token is defined. Finally, for each variable token, a variable mask is defined.
  • Figure 5: The input is read until the first variable token (Rd). This token is passed to the encoder (Enc). The decoder (Dec) receives the encoder output and the input. The first output of the decoder is combined with the mask corresponding to the variable token to generate a policy. The second output is the value.
  • ...and 9 more figures
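
As a rough illustration of how a stack-driven automaton can generate a sentence from a grammar, loosely in the spirit of the Figure 1 caption, here is a minimal sketch. The grammar is a hypothetical toy, not the paper's $G_3$, and the `derive` function and its rule numbering are assumptions made for this example. Expanding a variable pushes symbols onto the stack, and popping a terminal writes it to the output.

```python
# Toy grammar (hypothetical, NOT the paper's G_3): a single variable M,
# with terminals A, J, the Hadamard product sign, and parentheses.
RULES = {
    "M": [
        ["(", "M", "⊙", "M", ")"],  # rule 0: Hadamard product of two matrices
        ["(", "M", "M", ")"],        # rule 1: matrix product of two matrices
        ["A"],                       # rule 2: terminal A
        ["J"],                       # rule 3: terminal J
    ]
}

def derive(rule_choices, start="M"):
    """Leftmost derivation driven by an explicit stack.

    `rule_choices` lists, in order, which rule to apply each time a
    variable is popped from the stack.
    """
    stack = [start]
    output = []
    choices = iter(rule_choices)
    while stack:
        symbol = stack.pop()
        if symbol in RULES:
            # Variable: pick a rule and push its right-hand side, reversed
            # so that the leftmost symbol ends up on top of the stack.
            rhs = RULES[symbol][next(choices)]
            stack.extend(reversed(rhs))
        else:
            # Terminal: write it to the output sentence.
            output.append(symbol)
    return "".join(output)

sentence = derive([0, 3, 1, 2, 2])
assert sentence == "(J⊙(AA))"
```

In a search setting, each entry of `rule_choices` is one decision, so restricting the choice to the rules of the variable currently on top of the stack keeps every completed sequence grammatical.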

Theorems & Definitions (24)

  • Definition 2.1: Context-Free Grammar
  • Definition 2.2: Derivation
  • Definition 2.3: Context-Free Language
  • Definition 2.4: PushDown Automaton
  • Theorem 3.1: $3\text{-WL}$ CFG
  • Theorem 5.1: Efficient path counting
  • Proposition A.1
  • proof
  • Theorem A.1: $3\text{-WL}$ CFG
  • proof
  • ...and 14 more