Diffusion-DFL: Decision-focused Diffusion Models for Stochastic Optimization

Zihao Zhao, Christopher Yeh, Lingkai Kong, Kai Wang

TL;DR

This work addresses decision-focused learning under parameter uncertainty by introducing diffusion-based predictors that model complex, multi-modal distributions of uncertain parameters. It develops two end-to-end training approaches: a memory-intensive reparameterization method and a lightweight score-function estimator that uses a weighted ELBO gradient with variance reduction, enabling scalable diffusion DFL. Across synthetic, power-scheduling, and portfolio tasks, diffusion DFL consistently outperforms two-stage and deterministic baselines, and the score-function variant achieves comparable decision quality while cutting memory usage from around 60.75 GB to 0.13 GB. The authors release open-source code to support reproducibility and further research.

Abstract

Decision-focused learning (DFL) integrates predictive modeling and optimization by training predictors to optimize the downstream decision objective rather than merely minimizing prediction error. Existing DFL methods typically rely on deterministic point predictions, which are often insufficient to capture the intrinsic stochasticity of real-world environments. To address this challenge, we propose the first diffusion-based DFL approach, which trains a diffusion model to represent the distribution of uncertain parameters and optimizes the decision by solving a stochastic optimization problem with samples drawn from the diffusion model. Our contributions are twofold. First, we formulate diffusion DFL using the reparameterization trick, enabling end-to-end training through diffusion. While effective, it is memory- and compute-intensive because it must differentiate through the diffusion sampling process. Second, we propose a lightweight score function estimator that uses only several forward diffusion passes and avoids backpropagation through the sampling process. This follows from our result that backpropagating through stochastic optimization can be approximated by a weighted score function formulation. We empirically show that our diffusion DFL approach consistently outperforms strong baselines in decision quality. The source code for all experiments is available at the project repository: https://github.com/GT-KOALA/Diffusion_DFL.
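To make the contrast between the two gradient estimators concrete, the following is a minimal sketch on a toy problem, not the paper's method. It assumes a one-dimensional Gaussian $y \sim \mathcal{N}(\theta, 1)$ in place of a diffusion model and a hypothetical cost $f(y) = y^2$, so the true gradient of $\mathbb{E}[f(y)]$ with respect to $\theta$ is $2\theta$. It does not involve diffusion sampling or the weighted ELBO gradient mentioned in the TL;DR; the function names and the mean-cost baseline are illustrative choices.

```python
import numpy as np

def reparam_grad(theta, eps):
    # Reparameterization: write y = theta + eps and differentiate the
    # cost through the sample path. For f(y) = y^2, df/dtheta = 2 * y.
    y = theta + eps
    return np.mean(2.0 * y)

def score_function_grad(theta, eps):
    # Score function (REINFORCE-style): cost times the gradient of the
    # log-density. For N(theta, 1), d/dtheta log p(y) = y - theta.
    y = theta + eps
    cost = y ** 2
    baseline = cost.mean()  # simple mean baseline for variance reduction
    score = y - theta
    return np.mean((cost - baseline) * score)

rng = np.random.default_rng(0)
theta = 1.5
eps = rng.standard_normal(200_000)

g_rp = reparam_grad(theta, eps)
g_sf = score_function_grad(theta, eps)
print(f"true gradient: {2 * theta:.3f}")
print(f"reparameterization estimate: {g_rp:.3f}")
print(f"score function estimate: {g_sf:.3f}")
```

Both estimators target the same gradient. The reparameterization version needs the derivative of the cost along the sampling path, which in a diffusion model means backpropagating through every denoising step. The score function version needs only cost values and a log-density gradient, at the price of higher variance, which is why a baseline is subtracted here.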

Paper Structure

This paper contains 43 sections, 5 theorems, 62 equations, 11 figures, 2 tables.

Key Result

Proposition A.1

Let $T \in \mathbb{N}^+$, and suppose the reverse diffusion model defines a Gaussian distribution in Eq. eq:diffusion_reverse with fixed scalars $\sigma_t \geq 0$ and a standard normal prior $y_T \sim \mathcal{N}(0, I)$. Let $\{\epsilon_t\}_{t=0}^T$ be i.i.d. $\mathcal{N}(0, I)$. Then the model outp where $A_t := \frac{\partial \mu_\theta (y_t, t, x)}{\partial \theta}$, $J_t := \frac{\partial \mu_

Figures (11)

  • Figure 1: A comparison of deterministic vs. stochastic optimization with cost function $\exp(-yz)$, as described in \ref{sec:exp_toy}. (a) Each curve represents a cost function given a sample $y$. For any fixed $y$, the deterministic optimization decision lies at one of the boundaries ($z^*=0$ or $z^*=C$). (b) When averaging the cost function over many samples of $y$, the stochastic optimization decision lies in the interior of the feasible region instead of on the boundary. Thus, any deterministic optimization decision is suboptimal. (c) A probabilistic (diffusion) model captures a distribution over $Y$ that closely resembles the true bimodal distribution.
  • Figure 2: Cosine similarity between the reparameterization and score function gradient across different dimensions.
  • Figure 3: Learning curves for (a) score function with 10 and 50 samples (sf 10 and sf 50) and reparameterization (rp), (b) score function and importance-weighted score function with 10 samples.
  • Figure 4: Computation cost vs. performance trade-off for diffusion DFL training.
  • Figure 5: Test regret vs. decision dimension $d$ in the stock portfolio task.
  • ...and 6 more figures
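
The claim in the Figure 1 caption can be checked numerically with a small sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the feasible region is $z \in [0, C]$ with $C = 2$, and $y$ is drawn from a two-component Gaussian mixture with modes at $-1$ and $+1$, weights 0.3 and 0.7, and standard deviation 0.1. The stochastic decision is found by a grid search over the sample-average cost, and the deterministic decision plugs in the sample mean of $y$.

```python
import numpy as np

def sample_bimodal(n, rng):
    # Hypothetical bimodal distribution: modes at -1 and +1.
    modes = rng.choice([-1.0, 1.0], size=n, p=[0.3, 0.7])
    return modes + 0.1 * rng.standard_normal(n)

def expected_cost(z_grid, y):
    # Sample-average approximation of E[exp(-y * z)] for each z in z_grid.
    return np.exp(-np.outer(z_grid, y)).mean(axis=1)

C = 2.0
rng = np.random.default_rng(0)
y = sample_bimodal(5000, rng)

z_grid = np.linspace(0.0, C, 401)
costs = expected_cost(z_grid, y)

# Stochastic optimization: minimize the averaged cost over the grid.
z_stochastic = z_grid[np.argmin(costs)]

# Deterministic optimization with a point prediction (the sample mean):
# exp(-y_hat * z) is monotone in z, so the minimizer is a boundary point.
y_hat = y.mean()
z_point = C if y_hat > 0 else 0.0

cost_stochastic = expected_cost(np.array([z_stochastic]), y)[0]
cost_point = expected_cost(np.array([z_point]), y)[0]
print(f"stochastic decision z = {z_stochastic:.3f}, cost = {cost_stochastic:.3f}")
print(f"point-prediction decision z = {z_point:.3f}, cost = {cost_point:.3f}")
```

Under these assumptions the averaged cost is convex in $z$ with a minimizer strictly inside $(0, C)$, while the point-prediction decision sits on the boundary and incurs a higher expected cost.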

Theorems & Definitions (11)

  • Proposition A.1: Reparameterization trick in diffusion models
  • proof
  • Lemma A.2: Gradient of Reparameterization method
  • proof
  • Proposition A.3
  • proof
  • Proposition A.4
  • proof
  • Lemma A.5: Gradient of Score Function
  • proof
  • ...and 1 more