Improved Constrained Generation by Bridging Pretrained Generative Models

Xiaoxuan Liang, Saeid Naderiparizi, Yunpeng Liu, Berend Zwartsenberg, Frank Wood

TL;DR

This work proposes a constrained generation framework that generates samples directly within feasible regions while preserving realism, and fine-tunes a pretrained generative model to enforce constraints while maintaining generative fidelity.

Abstract

Constrained generative modeling is fundamental to applications such as robotic control and autonomous driving, where models must respect physical laws and safety-critical constraints. In real-world settings, these constraints rarely take the form of simple linear inequalities; instead, they define complex feasible regions that resemble road maps or other structured spatial domains. We propose a constrained generation framework that generates samples directly within such feasible regions while preserving realism. Our method fine-tunes a pretrained generative model to enforce constraints while maintaining generative fidelity. Experimentally, our method exhibits characteristics distinct from existing fine-tuning and training-free constrained baselines, revealing a new compromise between constraint satisfaction and sampling quality.

Paper Structure (42 sections, 3 theorems, 40 equations, 9 figures, 5 tables)

Key Result

Theorem 3.1

(Asymptotic validity of constraint gradient substitution) Assume: Then

Figures (9)

  • Figure 1: Qualitative comparison of constrained trajectory generation. We visualize samples from the same traffic intersection scene and timestep, where gray dotted trajectories indicate ground-truth agent motion and orange trajectories denote model predictions. Panel (a) shows an unconstrained diffusion baseline, which exhibits both offroad and collision violations while executing a left turn. Panel (b) shows a training-free guided method, MPGD without projection, which reduces violations but introduces noticeable trajectory distortion, including persistent offroad behavior and an abrupt acceleration during a right-turn maneuver toward the northeast, leading to a near-collision. Panel (c) shows our MBM++, which exhibits no offroad or collision violations while preserving realistic and coherent motion. All samples are generated from the same initial conditions.
  • Figure 2: Visualization of the bouncing balls task. Panels (a) and (b) show two consecutive steps, before and after a bounce between balls 3 and 8. In each step, each ball is described by a 4-dimensional vector holding its location and velocity; the velocity is shown by the red arrows. Panel (c) presents a Pareto comparison of diffusion-based models in terms of ELBO and infraction rate. Higher ELBO and lower infraction rate indicate better performance, so preferable methods lie toward the upper-left region. Our method lies near the Pareto frontier, achieving a favorable trade-off between these two metrics.
  • Figure 3: Comparisons on the DR_DEU_Merging validation dataset (a scenario with merging into a single lane). In panel (a), the unconstrained baseline produces an offroad trajectory for the vehicle near the bottom-right, which deviates beyond the right road boundary. In panel (b), MPGD without projection applies an overly strong correction that pushes the same vehicle toward the left boundary, while the middle vehicle exhibits noticeable trajectory distortion. In contrast, panel (c) keeps all predicted trajectories within the single drivable lane, preserving realistic motion without violations.
  • ...and 6 more figures
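
The Pareto comparison described in the Figure 2 caption (higher ELBO is better, lower infraction rate is better) can be illustrated with a minimal sketch. The method names and numbers below are hypothetical placeholders chosen for illustration; they are not results from the paper, and the helper `pareto_front` is an assumed name rather than part of the authors' code.

```python
from typing import List, Tuple

# (name, elbo, infraction_rate). All values are made-up placeholders,
# not measurements reported in the paper.
Point = Tuple[str, float, float]


def pareto_front(points: List[Point]) -> List[str]:
    """Return the names of points that no other point dominates.

    A point q dominates p when q has ELBO >= p's ELBO and
    infraction rate <= p's infraction rate, with at least one
    of the two comparisons strict.
    """
    front = []
    for name, elbo, infr in points:
        dominated = any(
            (e2 >= elbo and i2 <= infr) and (e2 > elbo or i2 < infr)
            for _, e2, i2 in points
        )
        if not dominated:
            front.append(name)
    return front


methods = [
    ("method_a", -10.0, 0.20),  # best ELBO, worst infraction rate
    ("method_b", -12.0, 0.05),
    ("method_c", -13.0, 0.10),  # dominated by method_b on both metrics
    ("method_d", -15.0, 0.01),  # worst ELBO, best infraction rate
]

print(pareto_front(methods))  # ['method_a', 'method_b', 'method_d']
```

In this toy setting, `method_c` is excluded because `method_b` is at least as good on both metrics and strictly better on each; the remaining three methods represent different trade-offs between the two metrics.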

Theorems & Definitions (5)

  • Theorem 3.1
  • proof
  • proof
  • Theorem B.1
  • Theorem B.2