Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints

Chuqin Geng; Li Zhang; Mark Zhang; Haolin Ye; Ziyu Zhao; Xujie Si

Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints

Chuqin Geng, Li Zhang, Mark Zhang, Haolin Ye, Ziyu Zhao, Xujie Si

TL;DR

Neuro-Symbolic Graph Generative Modeling (NSGGM) is introduced, a neurosymbolic framework that reapproaches molecule generation as a scaffold and interaction learning task with symbolic assembly, demonstrating that neuro-symbolic modeling can match state-of-the-art generative performance while offering explicit controllability and guarantees.

Abstract

We challenge black-box purely deep neural approaches for molecules and graph generation, which are limited in controllability and lack formal guarantees. We introduce Neuro-Symbolic Graph Generative Modeling (NSGGM), a neurosymbolic framework that reapproaches molecule generation as a scaffold and interaction learning task with symbolic assembly. An autoregressive neural model proposes scaffolds and refines interaction signals, and a CPU-efficient SMT solver constructs full graphs while enforcing chemical validity, structural rules, and user-specific constraints, yielding molecules that are correct by construction and interpretable control that pure neural methods cannot provide. NSGGM delivers strong performance on both unconstrained generation and constrained generation tasks, demonstrating that neuro-symbolic modeling can match state-of-the-art generative performance while offering explicit controllability and guarantees. To evaluate more nuanced controllability, we also introduce a Logical-Constraint Molecular Benchmark, designed to test strict hard-rule satisfaction in workflows that require explicit, interpretable specifications together with verifiable compliance.

Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints

TL;DR

Abstract

Paper Structure (70 sections, 4 theorems, 42 equations, 6 figures, 7 tables, 2 algorithms)

This paper contains 70 sections, 4 theorems, 42 equations, 6 figures, 7 tables, 2 algorithms.

Introduction
Preliminaries
Graph Generation Tasks
Constraint Satisfaction Problem Solving and SMT Solving
The NSGGM Framework
Decomposition and Vocabulary
Structural partitioning
Molecular interface characterization
Vocabulary and blueprint
Assembly Constraints for Graph Synthesis
Decision variables
Hard constraints ($\phi_{\text{hard}}$)
Neural guidance (Soft constraints ($\phi_{\text{soft}})$)
User constraints ($\phi_{\mathrm{user}}$)
Neural Proposals
...and 55 more sections

Key Result

Proposition 2.1

Let $G=(V,E)$ be any finite simple graph and let $\mathcal{C}(G)=\{c_1,\dots,c_p\}$ be a minimum cycle basis with edge sets $E(c_j)$. Define Let $\mathcal{P}(G)=\mathcal{C}(G)\cup \mathrm{Components}(G_A)$ and enumerate $\mathcal{P}(G)=\{P_i=(V_i,E_i)\}_{i=1}^m$. Then:

Figures (6)

Figure 1: An overview of the NSGGM framework.
Figure 2: Sample efficiency under logical constraints: cumulative number of constraint-satisfying molecules (up to 1000) versus the number of generation attempts. Colours denote model type and markers/linestyles denote constraints $\varphi_1$--$\varphi_{UNSAT}$ (with $\varphi_{UNSAT}$ unsatisfiable by construction). We only plot first 20K attempts for readability as curves beyond 20K are flat.
Figure 3: Example generated molecule satisfying $\varphi_3$
Figure 4: Example outputs from QM9 implicit. Samples were randomly chosen.
Figure 5: Example outputs from MOSES. Samples were randomly chosen.
...and 1 more figures

Theorems & Definitions (7)

Proposition 2.1: Legitimacy of Structural Partitioning
proof
Corollary 2.2: Canonical Witness for Reassembly
proof
Theorem 2.3: Soundness of Assembly Under $\phi_{\text{hard}}$
proof
Corollary 2.4: Correctness by Construction

Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints

TL;DR

Abstract

Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints

Authors

TL;DR

Abstract

Table of Contents

Key Result

Figures (6)

Theorems & Definitions (7)