A Taxonomy of Loss Functions for Stochastic Optimal Control
Carles Domingo-Enrich
TL;DR
This work clarifies how loss functions for deep stochastic optimal control (SOC) relate to one another by grouping them into classes that share the same gradient in expectation: losses within a class have the same optimization landscape and differ only in gradient variance. It introduces three novel losses (Work-SOCM, Cost-SOCM, Unweighted SOCM) and provides a formal taxonomy linking existing losses to each class. The authors validate the taxonomy with simple, synthetic SOC experiments, showing that gradient variance, problem dimensionality, and cost magnitudes govern convergence speed and stability more than the exact gradient structure. The results offer a unified lens for selecting SOC losses tailored to problem features, especially for reward fine-tuning in diffusion/flow models.
Abstract
Stochastic optimal control (SOC) aims to direct the behavior of noisy systems and has widespread applications in science, engineering, and artificial intelligence. In particular, reward fine-tuning of diffusion and flow matching models and sampling from unnormalized densities can be recast as SOC problems. Recent work introduced Adjoint Matching (Domingo-Enrich et al., 2024), a loss function for SOC problems that vastly outperforms existing loss functions in the reward fine-tuning setup. The goal of this work is to clarify the connections between all the existing (and some new) SOC loss functions. Namely, we show that SOC loss functions can be grouped into classes that share the same gradient in expectation, which means that their optimization landscape is the same; they differ only in their gradient variance. We perform simple SOC experiments to understand the strengths and weaknesses of different loss functions.
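To illustrate what "same gradient in expectation, different gradient variance" means, the following is a minimal sketch on a toy problem that is not taken from the paper. It assumes a scalar objective J(theta) = E[x^2] with x ~ N(theta, 1), whose exact gradient is 2 * theta, and compares two standard unbiased estimators: a score-function (REINFORCE-style) estimator and a pathwise (reparameterized) estimator. The function name `gradient_estimates` and all parameter values are hypothetical choices made for this illustration.

```python
import numpy as np


def gradient_estimates(theta, n_samples, seed=0):
    """Per-sample gradient estimates of J(theta) = E[x^2], x ~ N(theta, 1).

    Toy illustration only (not one of the SOC losses from the paper).
    Both estimators are unbiased for dJ/dtheta = 2 * theta.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_samples)
    x = theta + eps

    # Score-function estimator: f(x) * d/dtheta log p(x; theta),
    # where d/dtheta log N(x; theta, 1) = x - theta.
    score_function = x**2 * (x - theta)

    # Pathwise estimator: differentiate f(theta + eps) with respect to theta.
    pathwise = 2.0 * x

    return score_function, pathwise


theta = 1.5
sf, pw = gradient_estimates(theta, n_samples=200_000)

print(f"exact gradient:        {2 * theta:.3f}")
print(f"score-function mean:   {sf.mean():.3f}, variance: {sf.var():.2f}")
print(f"pathwise mean:         {pw.mean():.3f}, variance: {pw.var():.2f}")
```

Both sample means approach the exact gradient, so gradient descent with either estimator follows the same landscape on average, but the score-function estimates are far noisier. In this sketch the two estimators play the role of two losses in the same class: they agree in expectation and differ only in how much noise each stochastic gradient step carries.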
