Grammar-Constrained (CFL) Reachability: Subcubic Preprocessing, Indexing Trade-offs, and Structured Decoding Semantics

Faruk Alpay, Levent Sarioglu

TL;DR

This work presents an algorithmic framework for evaluating reachability queries constrained by context-free grammars, and analyzes its theoretical runtime bounds to provide guidance for selecting efficient approaches in practice.

Abstract

We study the problem of grammar-constrained context-free language reachability in graphs, focusing on complexity and empirical performance. We present an algorithmic framework for evaluating reachability queries constrained by context-free grammars, and analyze its theoretical runtime bounds. To complement our theoretical results, we conduct an extensive empirical evaluation on a comprehensive benchmark of real-world schemas, comparing different algorithmic variants and reporting performance trade-offs. Our results highlight the impact of grammar structure and graph characteristics on reachability computation, and provide guidance for selecting efficient approaches in practice.
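To make the problem statement concrete, the following is a minimal sketch of the textbook worklist (saturation) procedure for CFL reachability over a grammar in Chomsky normal form. It is not the paper's SatIndex or LinIndex algorithm, whose details are not given here. The function name `cfl_reach`, the rule encodings, and the example parenthesis grammar are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def cfl_reach(edges, terminal_rules, binary_rules):
    """Textbook worklist CFL reachability (illustrative, hypothetical API).

    edges:          iterable of (u, label, v) labeled graph edges
    terminal_rules: dict mapping a terminal to the set of nonterminals A
                    with a rule A -> terminal
    binary_rules:   list of (A, B, C) triples for rules A -> B C
    Returns the set of facts (A, u, v): some u-to-v path spells a word
    derivable from A.
    """
    facts = set()
    work = deque()
    out_by = defaultdict(set)  # (X, u) -> {v} for known facts X(u, v)
    in_by = defaultdict(set)   # (X, v) -> {u} for known facts X(u, v)

    by_left = defaultdict(list)   # B -> [(A, C)] for rules A -> B C
    by_right = defaultdict(list)  # C -> [(A, B)] for rules A -> B C
    for a, b, c in binary_rules:
        by_left[b].append((a, c))
        by_right[c].append((a, b))

    def add(a, u, v):
        # Record a new fact once and schedule it for propagation.
        if (a, u, v) not in facts:
            facts.add((a, u, v))
            out_by[(a, u)].add(v)
            in_by[(a, v)].add(u)
            work.append((a, u, v))

    # Seed with terminal rules A -> label applied to each edge.
    for u, label, v in edges:
        for a in terminal_rules.get(label, ()):
            add(a, u, v)

    # Saturate: combine each new fact with every compatible known fact.
    while work:
        x, u, v = work.popleft()
        for a, c in by_left[x]:            # x(u, v) then C(v, w) gives A(u, w)
            for w in list(out_by[(c, v)]):
                add(a, u, w)
        for a, b in by_right[x]:           # B(t, u) then x(u, v) gives A(t, v)
            for t in list(in_by[(b, u)]):
                add(a, t, v)
    return facts


# Example (assumed): balanced parentheses without the empty word, in CNF.
# S -> S S | L R | L T,  T -> S R,  L -> "(",  R -> ")"
terminal_rules = {"(": {"L"}, ")": {"R"}}
binary_rules = [("S", "S", "S"), ("S", "L", "R"), ("S", "L", "T"), ("T", "S", "R")]
edges = [(0, "(", 1), (1, "(", 2), (2, ")", 3), (3, ")", 4)]
facts = cfl_reach(edges, terminal_rules, binary_rules)
```

On the path graph above, `("S", 0, 4)` and `("S", 1, 3)` are derived because the corresponding paths spell balanced strings, while `("S", 0, 3)` is not. The procedure materializes every derivable fact, so a single-pair query afterwards is a set lookup. This is the kind of preprocessing/query trade-off the abstract refers to, though the paper's own bounds and index structures may differ.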

Paper Structure (33 sections, 10 theorems, 16 equations, 4 figures, 4 tables, 4 algorithms)

Key Result

Theorem 2.2

After running the SatIndex algorithm, for all $A\in N$ and $u,v\in V$, ...

Figures (4)

  • Figure 1: Terminal-anchored inference used by LinIndex. Dashed arrows are derived relations.
  • Figure 2: Grammar-class distribution across JSONSchemaBench sub-datasets.
  • Figure 3: Schema byte-size vs. grammar class. Linear schemas cluster at smaller sizes.
  • Figure 4: Decomposition pipeline diagram.

Theorems & Definitions (39)

  • Definition 1.1: Labeled directed graph
  • Definition 1.2: CFG
  • Definition 1.3: CNF
  • Definition 1.4: Validation-gated instance and query
  • Definition 1.5: Nonterminal relations
  • Definition 2.1: Saturation index SatIndex
  • Theorem 2.2: Correctness of SatIndex
  • proof
  • Theorem 2.3: Baseline preprocessing/space/query bounds
  • proof
  • ...and 29 more