A Taxonomy of Loss Functions for Stochastic Optimal Control

Carles Domingo-Enrich

TL;DR

This work clarifies how deep SOC loss functions relate by grouping them into classes that share the same gradient in expectation, meaning they have the same optimization landscape but differ in gradient variance. It introduces three novel losses (Work-SOCM, Cost-SOCM, Unweighted SOCM) and provides a formal taxonomy linking existing losses to each class. The authors validate the taxonomy with simple, synthetic SOC experiments, showing that gradient variance, problem dimensionality, and cost magnitudes govern convergence speed and stability more than the exact gradient structure. The results offer a unified lens for selecting SOC losses tailored to problem features, especially for reward fine-tuning in diffusion/flow models.

Abstract

Stochastic optimal control (SOC) aims to direct the behavior of noisy systems and has widespread applications in science, engineering, and artificial intelligence. In particular, reward fine-tuning of diffusion and flow matching models and sampling from unnormalized densities can be recast as SOC problems. Recent work introduced Adjoint Matching (Domingo-Enrich et al., 2024), a loss function for SOC problems that vastly outperforms existing loss functions in the reward fine-tuning setup. The goal of this work is to clarify the connections between all the existing (and some new) SOC loss functions. Namely, we show that SOC loss functions can be grouped into classes that share the same gradient in expectation, which means that their optimization landscape is the same; they differ only in their gradient variance. We perform simple SOC experiments to understand the strengths and weaknesses of the different loss functions.
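To make "same gradient in expectation, different gradient variance" concrete, here is a minimal sketch on a toy problem that is not taken from the paper: two unbiased estimators of the derivative of E[x^2] with respect to theta, for x drawn from N(theta, 1), whose true value is 2 * theta. The function name `gradient_estimators` and the toy objective are assumptions made for illustration; the pathwise estimator is only loosely analogous to adjoint-style losses and the score-function estimator to REINFORCE-style losses.

```python
import numpy as np

def gradient_estimators(theta=1.0, n_samples=100_000, seed=0):
    """Per-sample estimates of d/dtheta E_{x ~ N(theta, 1)}[x^2] = 2 * theta.

    Hypothetical toy example: both estimators are unbiased, so they share
    the same expected gradient, but their variances differ.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_samples)
    x = theta + eps  # reparameterized samples from N(theta, 1)

    # Pathwise estimator: differentiate x^2 through x = theta + eps.
    pathwise = 2.0 * x

    # Score-function estimator: f(x) * d/dtheta log N(x; theta, 1).
    score = x**2 * (x - theta)
    return pathwise, score

pathwise, score = gradient_estimators()
print("pathwise: mean", pathwise.mean(), "variance", pathwise.var())
print("score:    mean", score.mean(), "variance", score.var())
```

Both sample means approach 2 * theta, while the score-function estimator has the larger per-sample variance, so with a finite batch it would produce noisier updates toward the same optimum.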

Paper Structure (41 sections, 11 theorems, 92 equations, 5 figures, 1 algorithm)

Introduction
Related work
The stochastic optimal control problem
Loss functions for stochastic optimal control
Existing loss functions
The adjoint method
Adjoint Matching
The REINFORCE losses
The cross-entropy loss
Variance and log-variance losses
Moment loss
Stochastic optimal control matching (SOCM) loss
SOCM-Adjoint loss
New loss functions
Work-SOCM loss
...and 26 more sections

Key Result

Proposition 1

The gradients of the losses $\mathcal{L}_{\mathrm{Adj-Match}}$ and $\mathcal{L}_{\mathrm{Work-SOCM}}$ are equal in expectation. In particular, for any $x \in \mathbb{R}^d$, $t \in [0, T]$, and $M$ fulfilling the paper's assumption on $M$, we have that [equation omitted]. Hence, the only critical point of the loss $\mathcal{L}_{\mathrm{Work-SOCM}}$ is the optimal control $u^*$.

Figures (5)

Figure 1: Training losses for stochastic optimal control problems. Losses in blue scale to high dimensions, while losses in red do not, because their gradient variance blows up exponentially with the dimension. By Theorem 1, losses in the same block (there are five different blocks) are equal in expectation, i.e., taking an infinite batch size would yield the same gradient update. Novel losses are underlined, and losses that admit a Sticking The Landing version are identified with the suffix (+STL).
Figure 2: Control $L^2$ error incurred by each loss function throughout training, on five different settings.
Figure 3: Control $L^2$ error incurred by the Adjoint Matching, Continuous Adjoint and Discrete Adjoint losses (with and without the Sticking The Landing trick), on five different settings.
Figure 4: Control $L^2$ error incurred by each loss function throughout training, on five different settings.
Figure 5: Control $L^2$ error incurred by each loss function throughout training, on five different settings.
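
The Figure 1 caption says that the gradient variance of some losses blows up exponentially with the dimension. The sketch below illustrates one well-known mechanism for that kind of blow-up, importance weighting, on a Gaussian toy example; it is an assumption of this sketch, not a statement from the caption, that this is the relevant mechanism, and the functions `weight_second_moment` and `effective_sample_size` are hypothetical names. For p = N(0, I_d) and q = N(shift * 1, I_d), the weight w = p/q satisfies E_q[w^2] = exp(d * shift^2), which grows exponentially in d.

```python
import numpy as np

def weight_second_moment(dim, shift=0.5):
    """Exact E_q[w^2] for w = p/q, p = N(0, I_d), q = N(shift * 1, I_d)."""
    return np.exp(dim * shift**2)

def effective_sample_size(dim, shift=0.5, n_samples=10_000, seed=0):
    """Empirical effective sample size (sum w)^2 / sum(w^2) of the weights."""
    rng = np.random.default_rng(seed)
    x = shift + rng.standard_normal((n_samples, dim))  # samples from q
    # log w = log p(x) - log q(x) = -shift * sum_i x_i + dim * shift^2 / 2
    log_w = -shift * x.sum(axis=1) + 0.5 * dim * shift**2
    w = np.exp(log_w - log_w.max())  # rescaling leaves the ESS unchanged
    return w.sum() ** 2 / (w**2).sum()

for dim in (1, 10, 50):
    print(dim, weight_second_moment(dim), effective_sample_size(dim))
```

As the dimension grows, the effective sample size collapses toward a handful of samples, so any gradient estimate built on such weights is dominated by a few trajectories.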

Theorems & Definitions (14)

Proposition 1
Proposition 2
Proposition 3
Theorem 1: A taxonomy of SOC losses
Theorem 2: Girsanov theorem
Corollary 1: Girsanov theorem for SDEs
Theorem 3: Hamilton-Jacobi-Bellman equation
Theorem 4: Path-wise reparameterization trick (Domingo-Enrich et al., 2023, Prop. C.3)
Corollary 2: Path-wise reparameterization trick for stochastic optimal control (Domingo-Enrich et al., 2023, Prop. 1)
Theorem 5: Adjoint method for SDEs (Lemma 8 of Domingo-Enrich et al., 2023; Li et al., 2020; Kidger et al., 2021)
...and 4 more
