Controllable Graph Generation with Diffusion Models via Inference-Time Tree Search Guidance

Jiachi Zhao, Zehong Wang, Yamei Liao, Chuxu Zhang, Yanfang Ye

TL;DR

TreeDiff addresses the challenge of controllable graph generation with diffusion models by introducing an inference-time, planning-based framework that uses Monte Carlo Tree Search to navigate long denoising trajectories. It combines three innovations—macro-step expansion to reduce search depth, dual-space denoising to couple latent efficiency with graph-level fidelity, and a dual-space verifier to provide rollout-free long-horizon value estimates. Empirical results on 2D and 3D molecular generation show state-of-the-art performance in both conditional and unconditional settings, with robust inference-time scaling and improved stability over existing inference-time guidance methods. This approach enables multi-objective, design-driven graph generation without retraining, with clear benefits for drug discovery and materials science where structured validity and property alignment are crucial.

Abstract

Graph generation is a fundamental problem in graph learning with broad applications across Web-scale systems, knowledge graphs, and scientific domains such as drug and material discovery. Recent approaches leverage diffusion models for step-by-step generation, yet unconditional diffusion offers little control over desired properties, often leading to unstable quality and difficulty in incorporating new objectives. Inference-time guidance methods mitigate these issues by adjusting the sampling process without retraining, but they remain inherently local, heuristic, and limited in controllability. To overcome these limitations, we propose TreeDiff, a Monte Carlo Tree Search (MCTS) guided dual-space diffusion framework for controllable graph generation. TreeDiff is a plug-and-play inference-time method that expands the search space while keeping computation tractable. Specifically, TreeDiff introduces three key designs to make it practical and scalable: (1) a macro-step expansion strategy that groups multiple denoising updates into a single transition, reducing tree depth and enabling long-horizon exploration; (2) a dual-space denoising mechanism that couples efficient latent-space denoising with lightweight discrete correction in graph space, ensuring both scalability and structural fidelity; and (3) a dual-space verifier that predicts long-term rewards from partially denoised graphs, enabling early value estimation and removing the need for full rollouts. Extensive experiments on 2D and 3D molecular generation benchmarks, under both unconditional and conditional settings, demonstrate that TreeDiff achieves state-of-the-art performance. Notably, TreeDiff exhibits favorable inference-time scaling: it continues to improve with additional computation, while existing inference-time methods plateau early under limited resources.
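The search loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the state is a single float rather than a graph, the denoiser and the verifier are hand-written stand-ins rather than trained models, the UCT selection rule is the textbook one and is assumed here, and every name and constant (`T`, `K`, `BRANCH`, `TARGET`, `macro_step`, `verifier`, `search`) is hypothetical. It shows only how grouping `K` denoising updates into one transition shortens the tree, and how a verifier can score a partially denoised state without a full rollout.

```python
import math
import random

T = 12        # total denoising steps (hypothetical)
K = 4         # updates grouped into one macro-step (hypothetical)
BRANCH = 3    # children sampled per expansion (hypothetical)
TARGET = 1.0  # toy target property value (hypothetical)
MAX_DEPTH = math.ceil(T / K)  # tree depth after grouping: 3 instead of 12


def macro_step(x, rng, k=K):
    """Apply k toy stochastic denoising updates as one tree transition."""
    for _ in range(k):
        x = 0.9 * x + rng.gauss(0.0, 0.3)
    return x


def verifier(x):
    """Stand-in value estimate in [0, 1]; scores a state without a rollout."""
    return math.exp(-(x - TARGET) ** 2)


class Node:
    def __init__(self, x, depth, parent=None):
        self.x, self.depth, self.parent = x, depth, parent
        self.children, self.visits, self.value_sum = [], 0, 0.0

    def uct(self, c=1.0):
        """Upper-confidence score: mean value plus an exploration bonus."""
        if self.visits == 0:
            return math.inf
        bonus = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return self.value_sum / self.visits + bonus


def backpropagate(node, value):
    """Add one visit and the given value to the node and all its ancestors."""
    while node is not None:
        node.visits += 1
        node.value_sum += value
        node = node.parent


def search(x0, n_sim=200, seed=0):
    rng = random.Random(seed)
    root = Node(x0, 0)
    for _ in range(n_sim):
        node = root
        # Selection: follow the highest-UCT child down to a leaf.
        while node.children:
            node = max(node.children, key=Node.uct)
        if node.depth < MAX_DEPTH:
            # Expansion: sample BRANCH macro-step successors of the leaf.
            node.children = [Node(macro_step(node.x, rng), node.depth + 1, node)
                             for _ in range(BRANCH)]
            # Evaluation: the verifier scores each child directly (no rollout).
            for child in node.children:
                backpropagate(child, verifier(child.x))
        else:
            # Terminal leaf: re-score the fully denoised state.
            backpropagate(node, verifier(node.x))
    # Read out a trajectory by following the most-visited children.
    node = root
    while node.children:
        node = max(node.children, key=lambda n: n.visits)
    return node.x, node.depth
```

Calling `search(3.0)` returns the final toy state and the depth reached. With these constants the tree is at most `MAX_DEPTH = 3` levels deep, whereas branching at every one of the `T = 12` denoising steps would give 12 levels.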

Paper Structure

This paper contains 17 sections, 13 equations, 4 figures, 7 tables, 1 algorithm.

Figures (4)

  • Figure 1: Inference-time scaling behavior on four generation benchmarks. TreeDiff exhibits consistent gains with additional inference computation, whereas existing approaches (standard diffusion [jo2022score], Best-of-N [ma2025inference], and state-of-the-art inference-time guidance [li2024derivative]) saturate quickly, highlighting the scalability of our approach.
  • Figure 2: Comparison among diffusion-based graph generation approaches. (a) Standard graph diffusion iteratively denoises each step without correction. (b) Inference-time guidance introduces local corrections via classifier gradients but remains short-sighted. (c) Monte Carlo Tree Search (MCTS) enables structured lookahead over denoising trajectories, allowing for globally balanced exploration and exploitation.
  • Figure 3: TreeDiff integrates MCTS with diffusion-based graph generation. (a) TreeDiff follows the standard MCTS pipeline but introduces new designs in the Expansion and Simulation phases. (b) Macro-step expansion reduces tree depth by grouping multiple denoising updates. (c) Dual-space denoising couples latent diffusion with discrete structural correction to ensure validity. (d) Dual-space verifier predicts long-term rewards from latent and graph embeddings, enabling rollout-free evaluation.
  • Figure 4: Efficiency analysis on the MCTS tree depth vs. inference-time computation.
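
The dual-space denoising idea in Figure 3(c) can be pictured with a small sketch: a continuous update on edge scores followed by a discrete projection onto a valid graph. Everything here is an illustrative assumption rather than the paper's mechanism: the latent is a plain matrix of edge scores, the "denoiser" is a shrink-plus-noise update, and validity is reduced to a hypothetical degree cap (`MAX_DEGREE`). The function names `latent_denoise`, `discrete_correct`, and `dual_space_step` are likewise invented for this example.

```python
import random

MAX_DEGREE = 4  # hypothetical validity rule (a valence-style cap)


def latent_denoise(scores, rng, shrink=0.8, noise=0.1):
    """Toy continuous update on a matrix of edge scores (not a trained model)."""
    return [[shrink * s + rng.gauss(0.0, noise) for s in row] for row in scores]


def discrete_correct(scores, threshold=0.5, max_degree=MAX_DEGREE):
    """Project edge scores to a simple undirected graph that obeys max_degree."""
    n = len(scores)
    # Symmetrise the scores and rank candidate edges, strongest first.
    candidates = sorted(
        ((0.5 * (scores[i][j] + scores[j][i]), i, j)
         for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    adj = [[0] * n for _ in range(n)]
    degree = [0] * n
    for score, i, j in candidates:
        if score < threshold:
            break  # all remaining candidates are weaker still
        # Keep an edge only if neither endpoint would exceed the cap.
        if degree[i] < max_degree and degree[j] < max_degree:
            adj[i][j] = adj[j][i] = 1
            degree[i] += 1
            degree[j] += 1
    return adj


def dual_space_step(scores, rng):
    """One coupled step: latent update, then discrete correction in graph space."""
    scores = latent_denoise(scores, rng)
    return scores, discrete_correct(scores)


rng = random.Random(0)
n = 6
scores = [[2.0] * n for _ in range(n)]  # strong scores so the degree cap binds
scores, adj = dual_space_step(scores, rng)
```

The latent update stays cheap because it never checks validity; the correction step is the only place where the discrete constraint is enforced, so the adjacency matrix it returns is symmetric, has no self-loops, and respects the degree cap by construction.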