Table of Contents
Fetching ...

Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints

Chuqin Geng, Li Zhang, Mark Zhang, Haolin Ye, Ziyu Zhao, Xujie Si

TL;DR

Neuro-Symbolic Graph Generative Modeling (NSGGM) is introduced, a neurosymbolic framework that reapproaches molecule generation as a scaffold and interaction learning task with symbolic assembly, demonstrating that neuro-symbolic modeling can match state-of-the-art generative performance while offering explicit controllability and guarantees.

Abstract

We challenge black-box purely deep neural approaches for molecules and graph generation, which are limited in controllability and lack formal guarantees. We introduce Neuro-Symbolic Graph Generative Modeling (NSGGM), a neurosymbolic framework that reapproaches molecule generation as a scaffold and interaction learning task with symbolic assembly. An autoregressive neural model proposes scaffolds and refines interaction signals, and a CPU-efficient SMT solver constructs full graphs while enforcing chemical validity, structural rules, and user-specific constraints, yielding molecules that are correct by construction and interpretable control that pure neural methods cannot provide. NSGGM delivers strong performance on both unconstrained generation and constrained generation tasks, demonstrating that neuro-symbolic modeling can match state-of-the-art generative performance while offering explicit controllability and guarantees. To evaluate more nuanced controllability, we also introduce a Logical-Constraint Molecular Benchmark, designed to test strict hard-rule satisfaction in workflows that require explicit, interpretable specifications together with verifiable compliance.

Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints

TL;DR

Neuro-Symbolic Graph Generative Modeling (NSGGM) is introduced, a neurosymbolic framework that reapproaches molecule generation as a scaffold and interaction learning task with symbolic assembly, demonstrating that neuro-symbolic modeling can match state-of-the-art generative performance while offering explicit controllability and guarantees.

Abstract

We challenge black-box purely deep neural approaches for molecules and graph generation, which are limited in controllability and lack formal guarantees. We introduce Neuro-Symbolic Graph Generative Modeling (NSGGM), a neurosymbolic framework that reapproaches molecule generation as a scaffold and interaction learning task with symbolic assembly. An autoregressive neural model proposes scaffolds and refines interaction signals, and a CPU-efficient SMT solver constructs full graphs while enforcing chemical validity, structural rules, and user-specific constraints, yielding molecules that are correct by construction and interpretable control that pure neural methods cannot provide. NSGGM delivers strong performance on both unconstrained generation and constrained generation tasks, demonstrating that neuro-symbolic modeling can match state-of-the-art generative performance while offering explicit controllability and guarantees. To evaluate more nuanced controllability, we also introduce a Logical-Constraint Molecular Benchmark, designed to test strict hard-rule satisfaction in workflows that require explicit, interpretable specifications together with verifiable compliance.
Paper Structure (70 sections, 4 theorems, 42 equations, 6 figures, 7 tables, 2 algorithms)

This paper contains 70 sections, 4 theorems, 42 equations, 6 figures, 7 tables, 2 algorithms.

Key Result

Proposition 2.1

Let $G=(V,E)$ be any finite simple graph and let $\mathcal{C}(G)=\{c_1,\dots,c_p\}$ be a minimum cycle basis with edge sets $E(c_j)$. Define Let $\mathcal{P}(G)=\mathcal{C}(G)\cup \mathrm{Components}(G_A)$ and enumerate $\mathcal{P}(G)=\{P_i=(V_i,E_i)\}_{i=1}^m$. Then:

Figures (6)

  • Figure 1: An overview of the NSGGM framework.
  • Figure 2: Sample efficiency under logical constraints: cumulative number of constraint-satisfying molecules (up to 1000) versus the number of generation attempts. Colours denote model type and markers/linestyles denote constraints $\varphi_1$--$\varphi_{UNSAT}$ (with $\varphi_{UNSAT}$ unsatisfiable by construction). We only plot first 20K attempts for readability as curves beyond 20K are flat.
  • Figure 3: Example generated molecule satisfying $\varphi_3$
  • Figure 4: Example outputs from QM9 implicit. Samples were randomly chosen.
  • Figure 5: Example outputs from MOSES. Samples were randomly chosen.
  • ...and 1 more figures

Theorems & Definitions (7)

  • Proposition 2.1: Legitimacy of Structural Partitioning
  • proof
  • Corollary 2.2: Canonical Witness for Reassembly
  • proof
  • Theorem 2.3: Soundness of Assembly Under $\phi_{\text{hard}}$
  • proof
  • Corollary 2.4: Correctness by Construction