Markov Kernels, Distances and Optimal Control: A Parable of Linear Quadratic Non-Gaussian Distribution Steering

Alexis M. H. Teter; Wenqing Wang; Sachin Shivakumar; Abhishek Halder

TL;DR

The paper derives the Markov kernel κ for a linear time-varying Itô diffusion with a quadratic killing rate. The kernel is the Green's function of the corresponding linear reaction-advection-diffusion PDE and depends on the solution of a Riccati matrix ODE. The organizing principle behind the derivation is a distance-like functional obtained by solving a deterministic optimal control problem, which gives a unified treatment of Markov kernels, distances, and control. The main result is a closed-form kernel for the linear quadratic (LQ) non-Gaussian Schrödinger bridge, which makes the problem exactly solvable by dynamic Sinkhorn recursions for generic non-Gaussian endpoints with finite second moments. The framework subsumes previously known kernels (heat and linear) and extends them to a general pair $(\boldsymbol{A}_t,\boldsymbol{B}_t)$ with quadratic killing, offering a principled path to non-Gaussian distribution steering in stochastic control. The structural connections it exposes may be useful beyond the immediate LQ Schrödinger bridge problem.

Abstract

For a controllable linear time-varying (LTV) pair $(\boldsymbol{A}_t,\boldsymbol{B}_t)$ and $\boldsymbol{Q}_{t}$ positive semidefinite, we derive the Markov kernel for the Itô diffusion ${\mathrm{d}}\boldsymbol{x}_{t}=\boldsymbol{A}_{t}\boldsymbol{x}_t {\mathrm{d}} t + \sqrt{2}\boldsymbol{B}_{t}{\mathrm{d}}\boldsymbol{w}_{t}$ with an accompanying killing of probability mass at rate $\frac{1}{2}\boldsymbol{x}^{\top}\boldsymbol{Q}_{t}\boldsymbol{x}$. This Markov kernel is the Green's function for an associated linear reaction-advection-diffusion partial differential equation. Our result generalizes the recently derived kernel for the special case $\left(\boldsymbol{A}_t,\boldsymbol{B}_t\right)=\left(\boldsymbol{0},\boldsymbol{I}\right)$, and depends on the solution of an associated Riccati matrix ODE. A consequence of this result is that the linear quadratic non-Gaussian Schrödinger bridge is exactly solvable. This means that the problem of steering a controlled LTV diffusion from a given non-Gaussian distribution to another over a fixed deadline while minimizing an expected quadratic cost can be solved using dynamic Sinkhorn recursions performed with the derived kernel. Our derivation for the $\left(\boldsymbol{A}_t,\boldsymbol{B}_t,\boldsymbol{Q}_t\right)$-parametrized kernel pursues a new idea that relies on finding a state-time dependent distance-like functional given by the solution of a deterministic optimal control problem. This technique breaks away from existing methods, such as generalizing Hermite polynomials or Weyl calculus, which have seen limited success in the reaction-diffusion context. Our technique uncovers a new connection between Markov kernels, distances, and optimal control. This connection is of interest beyond its immediate application in solving the linear quadratic Schrödinger bridge problem.
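To make the phrase "dynamic Sinkhorn recursions performed with the derived kernel" concrete, the following is a minimal sketch on a one-dimensional grid. It is not the paper's implementation. The closed-form $\left(\boldsymbol{A}_t,\boldsymbol{B}_t,\boldsymbol{Q}_t\right)$-parametrized kernel is not reproduced in this summary, so the sketch substitutes the classical heat kernel, which is the transition density of ${\mathrm{d}}x_t=\sqrt{2}\,{\mathrm{d}}w_t$ (that is, $\boldsymbol{A}_t=\boldsymbol{0}$, $\boldsymbol{B}_t=\boldsymbol{I}$, and no killing). The function names `heat_kernel` and `sinkhorn_bridge`, the grid, the horizon `T`, and the two endpoint densities are all assumptions chosen for illustration.

```python
import numpy as np

def heat_kernel(x, y, T):
    """Transition density of dx = sqrt(2) dw over horizon T.

    Stand-in kernel (A = 0, B = 1, Q = 0); the paper's general kernel
    would be substituted here.
    """
    diff = x[:, None] - y[None, :]
    return np.exp(-diff**2 / (4.0 * T)) / np.sqrt(4.0 * np.pi * T)

def sinkhorn_bridge(rho0, rho1, K, dx, iters=2000, tol=1e-10):
    """Fixed-point recursion for the Schrodinger factors on a uniform grid.

    Seeks nonnegative phi0_hat, phi1 such that
        rho0 = phi0_hat * (K   @ phi1)     * dx
        rho1 = phi1     * (K.T @ phi0_hat) * dx
    where K[i, j] approximates the kernel from x_i at time 0 to y_j at time T.
    """
    phi1 = np.ones_like(rho1)
    for _ in range(iters):
        phi0_hat = rho0 / (K @ phi1 * dx)        # enforce the initial marginal
        phi1_new = rho1 / (K.T @ phi0_hat * dx)  # enforce the terminal marginal
        converged = np.max(np.abs(phi1_new - phi1)) < tol
        phi1 = phi1_new
        if converged:
            break
    phi0_hat = rho0 / (K @ phi1 * dx)
    return phi0_hat, phi1

# Illustrative non-Gaussian endpoints (assumed): bimodal start, Laplace-shaped end.
x = np.linspace(-3.0, 3.0, 121)
dx = x[1] - x[0]
rho0 = np.exp(-(x + 1.0)**2 / (2 * 0.3**2)) + np.exp(-(x - 1.0)**2 / (2 * 0.3**2))
rho0 /= rho0.sum() * dx
rho1 = np.exp(-np.abs(x) / 0.5)
rho1 /= rho1.sum() * dx

K = heat_kernel(x, x, T=1.0)
phi0_hat, phi1 = sinkhorn_bridge(rho0, rho1, K, dx)

# Marginals reproduced by the computed factors.
marg0 = phi0_hat * (K @ phi1) * dx
marg1 = phi1 * (K.T @ phi0_hat) * dx
```

Only the kernel matrix `K` encodes the dynamics in this sketch; the recursion itself is unchanged if `K` is replaced by a discretization of a different kernel. With killing, the kernel no longer integrates to one, but the two update lines keep the same form.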
