Exploring the Boundary of Diffusion-based Methods for Solving Constrained Optimization
Shutong Ding, Yimiao Zhou, Ke Hu, Xi Yao, Junchi Yan, Xiaoying Tang, Ye Shi
TL;DR
This work identifies a fundamental challenge in applying diffusion models to continuous constrained optimization: purely supervised diffusion tends to generate near-optimal but infeasible points in high dimensions. It introduces DiOpt, a two-phase diffusion framework that first uses a supervised warm-start and then engages a bootstrapped self-training regime guided by a feasibility-focused target distribution, along with a historical-solution replay mechanism to stabilize learning. The authors provide a theoretical justification for the infeasibility of purely supervised diffusion in high dimensions and show through extensive experiments (QP variants, ACOPF, and motion retargeting) that DiOpt achieves a favorable balance between feasibility and near-optimality, outperforming baselines. The approach has practical implications for real-world constrained optimization tasks in power systems, robotics, and related domains, where hard constraints and safety are critical.
Abstract
Diffusion models have achieved remarkable success in generative tasks such as image and video synthesis, and in control domains like robotics, owing to their strong generalization capabilities and proficiency in fitting complex multimodal distributions. However, their potential for solving Continuous Constrained Optimization problems remains largely underexplored. Our work begins by investigating a two-dimensional constrained quadratic optimization problem as an illustrative example, using it to expose the inherent challenges of applying diffusion models to such optimization tasks and to provide theoretical analyses of these observations. Building on this analysis, we propose DiOpt, a novel diffusion-based framework for Continuous Constrained Optimization. The framework operates in two distinct phases: an initial warm-start phase, implemented via supervised learning, followed by a bootstrapping phase. This two-phase design is intended to iteratively refine solutions, improving the objective value while rigorously satisfying the problem constraints. Finally, multiple candidate solutions are sampled, and the best one is selected through a screening process. We present extensive experiments detailing the training dynamics of DiOpt, its performance across a diverse set of Continuous Constrained Optimization problems, and an analysis of the impact of DiOpt's various hyperparameters.
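The final screening step can be illustrated with a minimal sketch. The abstract does not specify the screening rule, so the version below is an assumption: among the sampled candidates it keeps those that satisfy the constraints within a tolerance and returns the one with the lowest objective, falling back to the least-violating candidate if none is feasible. The function name `screen_candidates`, the tolerance `tol`, and the inequality-constrained QP form (minimize 0.5 x^T Q x + p^T x subject to G x <= h) are hypothetical choices for illustration, not taken from the paper.

```python
import numpy as np

def screen_candidates(candidates, Q, p, G, h, tol=1e-6):
    """Select one of K sampled candidates for the QP
    min 0.5 x^T Q x + p^T x  s.t.  G x <= h.
    Hypothetical screening rule, assumed for illustration."""
    X = np.asarray(candidates, dtype=float)            # shape (K, n)
    # Objective value of every candidate.
    objective = 0.5 * np.einsum("ki,ij,kj->k", X, Q, X) + X @ p
    # Worst constraint violation of every candidate (0 means feasible).
    violation = np.maximum(X @ G.T - h, 0.0).max(axis=1)
    feasible = violation <= tol
    if feasible.any():
        # Best objective among the feasible candidates.
        idx = np.flatnonzero(feasible)
        best = idx[np.argmin(objective[idx])]
    else:
        # No feasible candidate: fall back to the least-violating one.
        best = int(np.argmin(violation))
    return X[best], float(objective[best]), float(violation[best])

# Toy 2-D problem: the unconstrained minimizer (1, 1) violates x1 + x2 <= 1.
Q = np.eye(2)
p = np.array([-2.0, -2.0])
G = np.array([[1.0, 1.0]])
h = np.array([1.0])
samples = [[1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
x_best, f_best, v_best = screen_candidates(samples, Q, p, G, h)
print(x_best, f_best, v_best)
```

In this toy case the candidate (1, 1) has the lowest objective but violates the constraint, so the sketch returns the feasible candidate (0.5, 0.5) instead. This mirrors the failure mode described in the TL;DR, where a sample can be near-optimal yet infeasible.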
