Table of Contents
Fetching ...

RepV: Safety-Separable Latent Spaces for Scalable Neurosymbolic Plan Verification

Yunhao Yang, Neel P. Bhatt, Pranay Samineni, Rohan Siva, Zhanyang Wang, Ufuk Topcu

TL;DR

RepV is introduced, a neurosymbolic verifier that unifies both views by learning a latent space where safe and unsafe plans are linearly separable, and provides a probabilistic guarantee on the likelihood of correct verification based on its position in the latent space.

Abstract

As AI systems migrate to safety-critical domains, verifying that their actions comply with well-defined rules remains a challenge. Formal methods provide provable guarantees but demand hand-crafted temporal-logic specifications, offering limited expressiveness and accessibility. Deep learning approaches enable evaluation of plans against natural-language constraints, yet their opaque decision process invites misclassifications with potentially severe consequences. We introduce RepV, a neurosymbolic verifier that unifies both views by learning a latent space where safe and unsafe plans are linearly separable. Starting from a modest seed set of plans labeled by an off-the-shelf model checker, RepV trains a lightweight projector that embeds each plan, together with a language model-generated rationale, into a low-dimensional space; a frozen linear boundary then verifies compliance for unseen natural-language rules in a single forward pass. Beyond binary classification, RepV provides a probabilistic guarantee on the likelihood of correct verification based on its position in the latent space. This guarantee enables a guarantee-driven refinement of the planner, improving rule compliance without human annotations. Empirical evaluations show that RepV improves compliance prediction accuracy by up to 15% compared to baseline methods while adding fewer than 0.2M parameters. Furthermore, our refinement framework outperforms ordinary fine-tuning baselines across various planning domains. These results show that safety-separable latent spaces offer a scalable, plug-and-play primitive for reliable neurosymbolic plan verification. Code and data are available at: https://repv-project.github.io/.

RepV: Safety-Separable Latent Spaces for Scalable Neurosymbolic Plan Verification

TL;DR

RepV is introduced, a neurosymbolic verifier that unifies both views by learning a latent space where safe and unsafe plans are linearly separable, and provides a probabilistic guarantee on the likelihood of correct verification based on its position in the latent space.

Abstract

As AI systems migrate to safety-critical domains, verifying that their actions comply with well-defined rules remains a challenge. Formal methods provide provable guarantees but demand hand-crafted temporal-logic specifications, offering limited expressiveness and accessibility. Deep learning approaches enable evaluation of plans against natural-language constraints, yet their opaque decision process invites misclassifications with potentially severe consequences. We introduce RepV, a neurosymbolic verifier that unifies both views by learning a latent space where safe and unsafe plans are linearly separable. Starting from a modest seed set of plans labeled by an off-the-shelf model checker, RepV trains a lightweight projector that embeds each plan, together with a language model-generated rationale, into a low-dimensional space; a frozen linear boundary then verifies compliance for unseen natural-language rules in a single forward pass. Beyond binary classification, RepV provides a probabilistic guarantee on the likelihood of correct verification based on its position in the latent space. This guarantee enables a guarantee-driven refinement of the planner, improving rule compliance without human annotations. Empirical evaluations show that RepV improves compliance prediction accuracy by up to 15% compared to baseline methods while adding fewer than 0.2M parameters. Furthermore, our refinement framework outperforms ordinary fine-tuning baselines across various planning domains. These results show that safety-separable latent spaces offer a scalable, plug-and-play primitive for reliable neurosymbolic plan verification. Code and data are available at: https://repv-project.github.io/.

Paper Structure

This paper contains 23 sections, 21 equations, 13 figures, 1 table.

Figures (13)

  • Figure 1: A high-level illustration of RepV. It first learns a latent space where the safe (the interpreter's prediction aligns with the model checker) and unsafe plan representations are separable. Then, it classifies the new plan's compliance against the new natural language rules based on the spatial location of the plan representation in the latent space.
  • Figure 2: Automaton corresponding to a syntax "if $\sigma$, do $\omega_1$, else $\omega_2$" (left) and "while $\sigma$, do $\omega_3$" (right).
  • Figure 3: An illustration for collection of auto-labeled training data for the projector, training the projector to construct a safety-separable latent space, and predicting plan compliance against natural language rules via the spatial location of the projector's output representation.
  • Figure 4: Overview of the probabilistic verification and refinement pipeline. Left: After training the latent space, we extract a calibration distribution $F_C$ using distances from correctly verified samples to their cluster centroids. Right: Given a new plan–rule pair, we compute the distance between its latent representation and the nearest centroid $c^*$. Then, we use $F_C$ to compute the probabilistic guarantee. The guarantee can be subsequently used as a feedback signal for upgrading the planner.
  • Figure 5: Example visual and textual input prompts to the planner.
  • ...and 8 more figures