A Schrödinger Eigenfunction Method for Long-Horizon Stochastic Optimal Control

Louis Claeys; Artur Goldman; Zebang Shen; Niao He

Abstract

High-dimensional stochastic optimal control (SOC) becomes harder with longer planning horizons: the cost of existing methods scales linearly in the horizon $T$, and their performance often deteriorates exponentially. We overcome these limitations for a subclass of linearly-solvable SOC problems, namely those whose uncontrolled drift is the gradient of a potential. In this setting, the Hamilton-Jacobi-Bellman equation reduces to a linear PDE governed by an operator $\mathcal{L}$. We prove that, under the gradient drift assumption, $\mathcal{L}$ is unitarily equivalent to a Schrödinger operator $\mathcal{S} = -\Delta + \mathcal{V}$ with purely discrete spectrum, allowing the long-horizon control to be efficiently described via the eigensystem of $\mathcal{L}$. This connection provides two key results. First, for a symmetric linear-quadratic regulator (LQR), $\mathcal{S}$ matches the Hamiltonian of a quantum harmonic oscillator, whose closed-form eigensystem yields an analytic solution to the symmetric LQR with arbitrary terminal cost. Second, in a more general setting, we learn the eigensystem of $\mathcal{L}$ using neural networks. We identify implicit reweighting issues in existing eigenfunction learning losses that degrade performance in control tasks, and propose a novel loss function to mitigate them. We evaluate our method on several long-horizon benchmarks, achieving an order-of-magnitude improvement in control accuracy compared to state-of-the-art methods, while reducing memory usage and runtime complexity from $\mathcal{O}(Td)$ to $\mathcal{O}(d)$.
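To make the harmonic-oscillator connection in the abstract concrete, the following is a minimal sketch rather than the paper's method. It assumes a one-dimensional operator $-\frac{d^2}{dx^2} + x^2$, a normalization chosen here purely for illustration, and numerically checks the textbook fact that the Hermite functions $H_n(x)e^{-x^2/2}$ are its eigenfunctions with eigenvalues $2n+1$. The function names `hermite_function`, `apply_schrodinger`, and `eigen_residual` are hypothetical.

```python
import math


def hermite_function(n, x):
    """Unnormalised Hermite function H_n(x) * exp(-x^2 / 2).

    H_n is the physicists' Hermite polynomial, built with the recurrence
    H_{k+1}(x) = 2 x H_k(x) - 2 k H_{k-1}(x), starting from H_0 = 1.
    """
    h_prev, h = 0.0, 1.0  # H_{-1} := 0, H_0 = 1
    for k in range(n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h * math.exp(-0.5 * x * x)


def apply_schrodinger(f, x, step=1e-3):
    """Apply S = -d^2/dx^2 + x^2 to f at the point x.

    The second derivative is approximated with a central difference,
    so the result carries an O(step^2) discretisation error.
    """
    second = (f(x + step) - 2.0 * f(x) + f(x - step)) / step**2
    return -second + x * x * f(x)


def eigen_residual(n, x):
    """Return (S psi_n)(x) - (2n + 1) psi_n(x); near zero if psi_n is an eigenfunction."""
    def psi(y):
        return hermite_function(n, y)

    return apply_schrodinger(psi, x) - (2 * n + 1) * psi(x)


if __name__ == "__main__":
    for n in range(4):
        worst = max(abs(eigen_residual(n, x)) for x in (-1.5, -0.3, 0.0, 0.7, 2.0))
        print(f"n={n}: eigenvalue {2 * n + 1}, max residual {worst:.2e}")
```

The residuals are small but nonzero because of the finite-difference step. The spacing of the eigenvalues also illustrates why a spectral description can suit long horizons: when the spectrum is discrete, a few leading eigenpairs can carry most of the information.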

Paper Structure (70 sections, 20 theorems, 96 equations, 9 figures, 4 tables, 2 algorithms)

Introduction
Linearly-solvable HJB.
Our approach: Reduction to Schrödinger operator.
Preliminaries
Stochastic optimal control
Hamilton-Jacobi-Bellman equation.
A linear PDE reformulation
Eigenfunction solutions
Our framework
Spectral properties of the Schrödinger operator
Eigenfunction control
Closed-form solution for the symmetric LQR
Numerical methods
Learning eigenfunctions
PINN loss
...and 55 more sections

Key Result

Theorem 1

Let $\mathcal{L}$ be an essentially self-adjoint (an operator is called essentially self-adjoint if its closure is self-adjoint; see reed1980methods and reed1975ii for more details), densely defined operator on $\mathcal{H}$ which admits an orthonormal basis of eigenfunctions $(\phi_i, \lambda_i)_{i\i

Figures (9)

Figure 1: Performance degradation as time horizon $T$ increases for different methods (see the experiments appendix for details).
Figure 2: Diminishing returns from increasing the number of eigenfunctions for an LQR in $d=20$ dimensions.
Figure 3: Learned controls (arrows) and $V_0$ for different eigenfunction losses. Existing methods fail to learn the correct control in regions where $V_0$ is large due to implicit reweighting.
Figure 4: Comparison of the different eigenfunction losses (EMA).
Figure 5: Average $L^2$ control error (EMA) as a function of iteration (top row) and $L^2$ error as a function of $t\in[0,T]$ (bottom row).
...and 4 more figures

Theorems & Definitions (28)

Definition 1
Theorem 1: Restatement of Theorem VIII.7 in reed1980methods
Theorem 2: Restatement of reed1978iv, Theorem XIII.67, XIII.64, XIII.47
Remark 1
Theorem 3
Theorem 4
Remark 2
Definition 2
Lemma 1
Theorem 5
...and 18 more
