Generative Modelling with Tensor Train approximations of Hamilton--Jacobi--Bellman equations

David Sommer, Robert Gruhlke, Max Kirstein, Martin Eigel, Claudia Schillings

TL;DR

This work addresses the problem of sampling from a target density with unknown normalization by leveraging reverse-time diffusion and a Hamilton-Jacobi-Bellman (HJB) formulation. The authors propose a direct, time-integrated solver that operates on compressed polynomial bases encoded as Functional Tensor Trains (FTT) and Tensor Trains (TT), enabling sample-free, normalization-agnostic estimation of the log-density via a shifted HJB equation with $v_0=\Phi$. The method provides explicit TT representations for the linear and nonlinear HJB operators, along with projection and retraction steps to bound polynomial degree and TT rank, and it uses time-adaptive Euler steps (with stiffness and error criteria) or dynamical low-rank integration to solve the tensor-valued ODE. Numerical experiments in 20 dimensions demonstrate stable rank behavior, controlled errors, and accurate reverse-time sampling for Gaussian and mixed nonlinear densities, highlighting the potential for scalable Bayesian inference without requiring normalization constants. The results suggest that the TT-based HJB solver can serve as a principled, interpretable alternative to black-box neural approaches for high-dimensional generative modeling and Bayesian sampling, with avenues for future improvements via advanced DLRA techniques and diffusion-accelerated samplers.

Abstract

Sampling from probability densities is a common challenge in fields such as Uncertainty Quantification (UQ) and Generative Modelling (GM). In GM in particular, reverse-time diffusion processes that depend on the log-densities of Ornstein-Uhlenbeck forward processes are a popular sampling tool. In Berner et al. [2022] the authors point out that these log-densities can be obtained by solving a Hamilton-Jacobi-Bellman (HJB) equation known from stochastic optimal control. While this HJB equation is usually treated with indirect methods such as policy iteration and unsupervised training of black-box architectures like Neural Networks, we propose instead to solve the HJB equation by direct time integration, using compressed polynomials represented in the Tensor Train (TT) format for spatial discretization. Crucially, this method is sample-free, agnostic to normalization constants, and can avoid the curse of dimensionality due to the TT compression. We provide a complete derivation of the HJB equation's action on Tensor Train polynomials and demonstrate the performance of the proposed time-step-, rank- and degree-adaptive integration method on a nonlinear sampling task in 20 dimensions.
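To make the role of the log-densities concrete, the following is a minimal one-dimensional sketch of reverse-time sampling, not the authors' implementation. It assumes the forward Ornstein-Uhlenbeck convention $dX_t = -X_t\,dt + \sqrt{2}\,dW_t$ (the paper's scaling may differ) and a Gaussian target, for which the score $\nabla \log p_t$ is known in closed form and therefore stands in for the HJB solution. The function name `reverse_ou_sample` and all parameter values are hypothetical.

```python
import numpy as np

def reverse_ou_sample(mean0, var0, T=5.0, n_steps=500, n_samples=20000, seed=0):
    """Euler-Maruyama for the reverse-time OU process with a Gaussian target.

    Assumed forward process: dX = -X dt + sqrt(2) dW, X_0 ~ N(mean0, var0),
    so that X_t ~ N(exp(-t) * mean0, exp(-2t) * var0 + 1 - exp(-2t)).
    """
    rng = np.random.default_rng(seed)
    tau = T / n_steps
    # Start from the stationary law N(0, 1), which approximates p_T for large T.
    y = rng.standard_normal(n_samples)
    for k in range(n_steps):
        t = T - k * tau  # forward time matching the current reverse state
        m_t = np.exp(-t) * mean0
        v_t = np.exp(-2.0 * t) * var0 + 1.0 - np.exp(-2.0 * t)
        score = -(y - m_t) / v_t  # exact gradient of log p_t for a Gaussian
        # Reverse drift is y + 2 * score; the diffusion coefficient stays sqrt(2).
        noise = rng.standard_normal(n_samples)
        y = y + tau * (y + 2.0 * score) + np.sqrt(2.0 * tau) * noise
    return y

samples = reverse_ou_sample(mean0=2.0, var0=0.25)
print(samples.mean(), samples.var())
```

In the method described above, the closed-form score would be replaced by the gradient of the TT-approximated log-density; the sampling loop itself is independent of how that gradient is obtained.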

Paper Structure

This paper contains 40 sections, 10 theorems, 117 equations, 7 figures, 1 table, 3 algorithms.

Key Result

Lemma 3.1

Let $f\in\mathcal{C}^2(K)$ have FTT-rank $\bm{r}\in\mathbb{N}^{d-1}$. Then, $\operatorname{Lin}(f)$ has FTT-rank at most $2\bm{r}$.
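The definition of $\operatorname{Lin}$ is not reproduced in this overview, so the following is only a loosely analogous discrete illustration. It assumes an operator that is a sum of one-dimensional operators acting on one coordinate each (Laplace-like), applied to a full tensor with known TT-ranks, and checks numerically that the ranks at most double. The helper names `tt_ranks`, `random_tt` and `apply_sum_of_mode_ops` are hypothetical, and forming the full tensor is feasible only for such small examples.

```python
import numpy as np

def tt_ranks(tensor, tol=1e-10):
    """TT-ranks computed as numerical ranks of the sequential unfoldings."""
    dims = tensor.shape
    ranks = []
    for k in range(1, len(dims)):
        mat = tensor.reshape(int(np.prod(dims[:k])), -1)
        s = np.linalg.svd(mat, compute_uv=False)
        ranks.append(int(np.sum(s > tol * s[0])))
    return ranks

def random_tt(dims, rank, rng):
    """Full tensor assembled from random TT cores with uniform rank."""
    r = [1] + [rank] * (len(dims) - 1) + [1]
    full = rng.standard_normal((r[0], dims[0], r[1]))
    for k in range(1, len(dims)):
        core = rng.standard_normal((r[k], dims[k], r[k + 1]))
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape(dims)  # drop the boundary ranks of size 1

def apply_sum_of_mode_ops(tensor, op):
    """Apply sum_k (I x ... x op x ... x I), with op acting on mode k."""
    out = np.zeros_like(tensor)
    for k in range(tensor.ndim):
        moved = np.tensordot(op, tensor, axes=([1], [k]))
        out += np.moveaxis(moved, 0, k)
    return out

rng = np.random.default_rng(0)
n = 6
f = random_tt((n, n, n, n), rank=2, rng=rng)
# Second-difference matrix as a stand-in for a one-dimensional operator.
D = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
ranks_f = tt_ranks(f)
ranks_Lf = tt_ranks(apply_sum_of_mode_ops(f, D))
print(ranks_f, ranks_Lf)
```

The doubling reflects the fact that such a sum of one-dimensional operators itself admits a TT-operator representation of rank 2, so applying it multiplies each rank by at most 2.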

Figures (7)

  • Figure 5.1: Evolution of the solution ranks and the covariance error over time in the Gaussian setting. Once the solution is close to convergence (in terms of the covariance error), the ranks decrease to the rank $(2,\ldots,2)$ of the potential of the standard normal distribution.
  • Figure 5.2: Approximations of the maximal absolute eigenvalues of the linearized right-hand side $|\overline{\lambda}_{t}|$ determined by the power method (left) and the accordingly chosen stepsize $2\rho/|\overline{\lambda}_{t}|$ (right) over time in the Gaussian setting. Note that the eigenvalues decrease monotonically, permitting a monotonic increase of the stepsize until the maximal permitted stepsize $\tau_{\max} = 0.1$ is reached.
  • Figure 5.3: Comparison of the discretized HJB solution and its derived gradient for different maximum time steps $\tau_{\max}=0.1,0.01,0.001$ to the exact score for a random Gaussian target distribution with $d = 10$. One can clearly see the $\mathcal{O}(\tau_{\max})$ dependence of the error in both the gradient and the covariance matrix of the approximate solution. Note that for small times $t \ll 1$, the discretizations for maximal stepsizes $0.1$ and $0.01$ lead to similar $L^2$-errors in the gradient, which can be attributed to an equally small adaptively chosen stepsize in this early (stiff) regime (compare to the right plot in Figure 5.2, which shows the realized step sizes for $\tau_{\max} = 0.1$).
  • Figure 5.4: Evolution of the marginal densities (blue) and the samples produced by the corresponding reverse process defined by the reverse-sampling algorithm (red) in the mixed nonlinear case. The first row shows the values of the densities and samples on the $(x_1,x_2)$-plane, which is governed by the Banana potential. The second row concerns the $(x_3,x_4)$-plane, which is governed by the nonsymmetric multimodal potential. The third row shows the $(x_5,x_6)$-plane, governed by the bimodal potential. On the level of the HJB solver, the plot should be viewed from right to left, since the target density (right) is transformed to a standard Gaussian (left). On the level of the reverse process, the samples (red) move from the standard Gaussian on the left to the target measure on the right. We note that in all cases the sampler is able to reproduce the multimodality and curvature of the corresponding density.
  • Figure 5.5: Approximations of the maximal absolute eigenvalues of the linearized HJB right-hand side (left) and the accordingly chosen time stepsizes (right), as in Figure 5.2 but for the mixed nonlinear potential. Note the jump in the stepsize at $t=10^{-6}$, which corresponds to a change in the stiffness control parameter $\rho$. Up to small perturbations, which may be attributed to inaccuracy of the power method, the stepsizes are again monotonically increasing.
  • ...and 2 more figures
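
The stiffness-based stepsize rule mentioned in the captions of Figures 5.2 and 5.5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' solver: a fixed symmetric matrix stands in for the linearized right-hand side, and the function names `power_method` and `stiffness_stepsize` are hypothetical. The rule itself, $\tau = \min(\tau_{\max}, 2\rho/|\overline{\lambda}_t|)$, is taken from the captions; for the linear test equation $y' = \lambda y$ with $\lambda < 0$, explicit Euler is stable for $\tau \le 2/|\lambda|$, so $\rho \le 1$ acts as a safety factor.

```python
import numpy as np

def power_method(matvec, dim, n_iter=200, seed=0):
    """Estimate the largest absolute eigenvalue of a linear map via power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(n_iter):
        w = matvec(v)
        lam = float(v @ w)  # Rayleigh quotient of the current unit vector
        norm = np.linalg.norm(w)
        if norm == 0.0:
            break
        v = w / norm
    return abs(lam)

def stiffness_stepsize(lam_abs, rho, tau_max):
    """Stepsize 2 * rho / |lambda|, capped at the maximal permitted stepsize."""
    return min(tau_max, 2.0 * rho / lam_abs)

# Stand-in for a stiff linearized right-hand side (assumed for illustration only).
A = np.diag([-1000.0, -10.0, -1.0])
lam_abs = power_method(lambda v: A @ v, dim=3)
tau = stiffness_stepsize(lam_abs, rho=0.5, tau_max=0.1)
print(lam_abs, tau)
```

As the dominant eigenvalue shrinks over time, the same rule yields larger steps until the cap $\tau_{\max}$ becomes active, which matches the behaviour described in the captions.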

Theorems & Definitions (19)

  • Remark 2.1: Reverse-time Ornstein-Uhlenbeck process
  • Lemma 3.1
  • proof
  • Lemma 3.2
  • proof
  • Lemma 3.3
  • proof
  • Lemma 3.4
  • proof
  • Lemma 3.5
  • ...and 9 more