Deep Latent Force Models: ODE-based Process Convolutions for Bayesian Deep Learning

Thomas Baldwin-McDonald, Mauricio A. Álvarez

TL;DR

The paper proposes the Deep Latent Force Model (DLFM), a Bayesian deep learning architecture that embeds physics through ODE-derived kernels in a deep GP framework. It provides two scalable inference schemes: DLFM-RFF, which uses random Fourier features in a weight-space deep GP, and DLFM-VIP, which relies on variational inducing points with pathwise sampling to address nonstationarity and extrapolation. Empirical results show DLFM can capture nonlinear dynamics in toy and real-world time series and perform competitively on UCI benchmarks, with DLFM-RFF excelling at short-range extrapolation and DLFM-VIP offering stable uncertainty in interpolation and limited extrapolation. The work analyzes decay-parameter effects, compares inference schemes, and discusses future directions such as interdomain kernels to fuse global structure with physics-informed priors. Overall, DLFM provides a principled, scalable way to combine mechanistic insight with Bayesian deep learning for robust dynamical modeling and uncertainty quantification.

Abstract

Modelling the behaviour of highly nonlinear dynamical systems with robust uncertainty quantification is a challenging task which typically requires approaches specifically designed to address the problem at hand. We introduce a domain-agnostic model to address this issue termed the deep latent force model (DLFM), a deep Gaussian process with physics-informed kernels at each layer, derived from ordinary differential equations using the framework of process convolutions. Two distinct formulations of the DLFM are presented which utilise weight-space and variational inducing points-based Gaussian process approximations, both of which are amenable to doubly stochastic variational inference. We present empirical evidence of the capability of the DLFM to capture the dynamics present in highly nonlinear real-world multi-output time series data. Additionally, we find that the DLFM is capable of achieving comparable performance to a range of non-physics-informed probabilistic models on benchmark univariate regression tasks. We also empirically assess the negative impact of the inducing points framework on the extrapolation capabilities of LFM-based models.

Paper Structure

This paper contains 37 sections, 32 equations, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: A conceptual explanation of how the DLFM differs from a DGP. At each layer, we perform the operation $\mathcal{L}^{(\ell)}\{x, G\} = \int_0^x G^{(\ell)}(x-\tau)u(\tau)d\tau$, where $G$ is the Green's function corresponding to an ODE, and $u(\cdot)$ represents an exponentiated quadratic GP prior. For example, the second operation in the model shown above would take the form $\mathcal{L}^{(2)}\{f_1, G\} = \int_0^{f_1} G^{(2)}(f_1-\tau)u(\tau)d\tau$.
  • Figure 2: An illustration of how the DLFM with random Fourier features differs from a DGP with random feature expansions, with this example containing two layers. At each layer of the DLFM-RFF, for each input dimension, $N_{RF}$ random features of the form shown in Eq. \ref{eq:ode1_feature} are computed for each of the $Q$ latent forces. The random feature vector $\boldsymbol{\phi}_{LFM}^{(\ell)}$ is then formed by taking the sum of these features across the input dimensions. This summation is shown in the figure by the block containing $\Sigma$.
  • Figure 3: Model fits to our toy dynamical system with latent function $u(t)=\cos(0.5 t) + 6\sin(3 t)$. Grey, orange and red data-points represent training, imputation test and extrapolation test data respectively. Purple curves represent predictive means, and shaded areas represent $\pm 2\sigma$.
  • Figure 4: Predictions generated for each of the three outputs within the CHARIS imputation experiment, from a DLFM-RFF and DLFM-VIP. Grey dots represent training data, orange dots represent test data, the purple line represents the predictive mean of the model, and the shaded grey areas in each plot represent $\pm 2\sigma$.
  • Figure 5: Predictions generated for each of the three outputs within the CHARIS extrapolation experiment, from a DLFM-RFF and DLFM-VIP. Grey dots represent training data, orange dots represent test data, the purple line represents the predictive mean of the model, and the shaded grey areas in each plot represent $\pm 2\sigma$. Here, we have zoomed in on the final 100 timesteps which includes the extrapolation region; the previous 900 training observations are not shown.
  • ...and 1 more figure
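
The layer operation in the Figure 1 caption, $\int_0^x G(x-\tau)u(\tau)d\tau$, can be illustrated numerically. The sketch below is not the authors' implementation. It assumes a first-order ODE $df/dt + \gamma f = u$, whose Green's function is $G(t) = e^{-\gamma t}$, and it fixes $u$ to the deterministic toy latent function from the Figure 3 caption instead of sampling it from a GP prior. The names `latent_force`, `green_first_order`, `convolve`, and `closed_form`, and the default decay value of 1.0, are hypothetical and chosen only for illustration.

```python
import math


def latent_force(t):
    """Toy latent function u(t) = cos(0.5 t) + 6 sin(3 t) from the Figure 3 caption."""
    return math.cos(0.5 * t) + 6.0 * math.sin(3.0 * t)


def green_first_order(t, decay=1.0):
    """Green's function exp(-decay * t) of the assumed ODE df/dt + decay * f = u."""
    return math.exp(-decay * t)


def convolve(x, green, u, n_steps=2000):
    """Trapezoidal approximation of the integral of green(x - tau) * u(tau) over [0, x]."""
    if x == 0.0:
        return 0.0
    h = x / n_steps
    # End points carry half weight in the trapezoidal rule.
    total = 0.5 * (green(x) * u(0.0) + green(0.0) * u(x))
    for i in range(1, n_steps):
        tau = i * h
        total += green(x - tau) * u(tau)
    return total * h


def closed_form(x, decay=1.0, a=0.5, b=3.0, amp=6.0):
    """Exact value of the same integral for the toy u(t); used only as a check."""
    e = math.exp(-decay * x)
    cos_part = (decay * math.cos(a * x) + a * math.sin(a * x) - decay * e) / (decay**2 + a**2)
    sin_part = (decay * math.sin(b * x) - b * math.cos(b * x) + b * e) / (decay**2 + b**2)
    return cos_part + amp * sin_part


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 4.0):
        approx = convolve(x, green_first_order, latent_force)
        print(f"x={x:.1f}  numeric={approx:.5f}  exact={closed_form(x):.5f}")
```

Running the script prints the numerical and exact values of the convolution side by side at a few inputs. In the construction described by the caption, the output of one such operation becomes the upper integration limit of the next layer, and $u$ is a draw from an exponentiated quadratic GP prior, not a fixed function.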