System Identification for Continuous-time Linear Dynamical Systems

Peter Halmos, Jonathan Pillow, David A. Knowles

TL;DR

A novel two-filter, analytical form for the posterior, with a Bayesian derivation, is introduced. It yields analytical updates that do not require the forward pass to be pre-computed, and supports an EM procedure that estimates the parameters of the SDE while naturally incorporating irregularly sampled measurements.

Abstract

The problem of system identification for the Kalman filter, relying on the expectation-maximization (EM) procedure to learn the underlying parameters of a dynamical system, has largely been studied assuming that observations are sampled at equally spaced time points. However, in many applications this is a restrictive and unrealistic assumption. This paper addresses system identification for the continuous-discrete filter, with the aim of generalizing learning for the Kalman filter by relying on a solution to a continuous-time Itô stochastic differential equation (SDE) for the latent state and covariance dynamics. We introduce a novel two-filter, analytical form for the posterior with a Bayesian derivation, which yields analytical updates that do not require the forward pass to be pre-computed. Using this analytical and efficient computation of the posterior, we provide an EM procedure that estimates the parameters of the SDE, naturally incorporating irregularly sampled measurements. Generalizing the learning of latent linear dynamical systems (LDS) to continuous time may extend the use of the hybrid Kalman filter to data that is not regularly sampled or has intermittent missing values, and can extend the power of non-linear system identification methods such as switching LDS (SLDS), which rely on EM for the linear discrete-time Kalman filter as a sub-unit for learning locally linearized behavior of a non-linear system. We apply the method by learning the parameters of a latent, multivariate Fokker-Planck SDE representing a toggle-switch genetic circuit using biologically realistic parameters, and compare the efficacy of learning relative to the discrete-time Kalman filter as the step-size irregularity and the spectral radius of the dynamics matrix increase.
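To make the continuous-discrete setting concrete, the following is a minimal sketch of a forward Kalman filter for a scalar linear SDE observed at irregular times. It is not the paper's two-filter posterior or its EM procedure; it only illustrates how irregular gaps enter through the transition $e^{a\,\Delta t}$ and a gap-dependent process variance. The scalar model $dx = a\,x\,dt + \sqrt{q_c}\,dW$, the observation model $y = h\,x + \text{noise}$ with variance `r`, and all function and parameter names are assumptions made for illustration (`a`, `q_c`, `h` loosely mirror $\mathbf{A}$, $\mathbf{Q}_c$, $\mathbf{H}$).

```python
import math


def transition(a, q_c, dt):
    """Exact discretisation of dx = a*x dt + sqrt(q_c) dW over a gap dt.

    Returns (phi, q): the state multiplier exp(a*dt) and the accumulated
    process variance q_c * (exp(2*a*dt) - 1) / (2*a), with the a -> 0 limit
    q_c * dt handled separately.
    """
    phi = math.exp(a * dt)
    if abs(a) < 1e-12:
        q = q_c * dt
    else:
        q = q_c * (math.exp(2.0 * a * dt) - 1.0) / (2.0 * a)
    return phi, q


def filter_irregular(times, ys, a, q_c, h, r, m0, p0):
    """Forward filter over observations ys taken at (possibly irregular) times.

    m0, p0 are the prior mean and variance at times[0] (illustrative names).
    Returns the filtered means and variances at each observation time.
    """
    means, variances = [], []
    m, p = m0, p0
    t_prev = times[0]
    for t, y in zip(times, ys):
        # Predict across the gap; a longer gap inflates the variance more.
        phi, q = transition(a, q_c, t - t_prev)
        m_pred = phi * m
        p_pred = phi * p * phi + q
        # Update with the new measurement.
        s = h * p_pred * h + r
        k = p_pred * h / s
        m = m_pred + k * (y - h * m_pred)
        p = (1.0 - k * h) * p_pred
        means.append(m)
        variances.append(p)
        t_prev = t
    return means, variances


# Irregularly spaced observation times (hypothetical data).
times = [0.0, 0.1, 0.15, 0.9, 1.0]
ys = [1.0, 0.8, 0.85, 0.3, 0.35]
means, variances = filter_irregular(times, ys, a=-1.0, q_c=0.5, h=1.0, r=0.1,
                                    m0=0.0, p0=1.0)
```

Because the gap length appears only inside `transition`, the same loop handles regular sampling, irregular sampling, and missing values without any change.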

Paper Structure

This paper contains 18 sections, 12 theorems, 228 equations, 4 figures, 2 algorithms.

Key Result

Proposition 1

Given system parameters $(\mathbf{Q}_{c}, \mathbf{A}, \mathbf{H})$, and $\bm{V}(t)$, $\bm{Q}(t)$ as time-dependent covariance functions defined in eqn:Vt and eqn:Qt, the recursive likelihood eqn:recursive_main is distributed with mean $\bm{\mu}_{k+1}^{b/+}$ and covariance [...], for a backwards-direction gain matrix.

Figures (4)

  • Figure 1: Isocontours of the time-dependent covariance $\bm{Q}(s)$ centered about the state-mean $\bm{x}(s)^{f/-} = e^{\bm{A}s}\bm{x}(t_{k-1})$. Snapshots show the increase in uncertainty of the time-dependent covariance between measurements and collapse to a certain state following measurement.
  • Figure 2: Comparison of model error for uniformly random observation intervals, given by eqn:loss1 and eqn:loss2 in a) and eqn:loss3 and eqn:loss4 in b). Discrete-time parameters were learned using https://github.com/pykalman/pykalman and the continuous-time parameters using Algorithm alg:cap. Both were capped at 100 EM iterations. The mean, error bars representing the interquartile range (IQR), and whiskers extending the box by 1.5 $\times$ IQR are shown over 100 different simulations of the system. The spectral radius ranges from $\rho(\mathbf{A}) \approx 0.03$ to $\rho(30 \times \mathbf{A}) \approx 1$ (relatively stable).
  • Figure 3: A comparison of the Frobenius-norm error for the example of Beta-distributed observation intervals with variance-controlling parameter $\gamma$. Loss given by eqn:loss1 and eqn:loss2 in a) and eqn:loss3 and eqn:loss4 in b). Parameters were learned with a maximum of 100 EM iterations per simulation across 100 random datasets. The Beta distribution is scaled to be in the interval $[0,1/2]$ with 40 time-steps total so that the expected time is 10 minutes; $\gamma$ ranges from $1/2$ to $10000$, and $\mathbf{A}$ is homogeneously scaled so that $\rho(\mathbf{A}) = 1$.
  • Figure 4: Distributions for different values of the time-step variance-parameter $\gamma$, illustrating how varying $\gamma$ will alter the variance of the observation intervals $\tau \sim \textit{Beta}\left( \gamma, \gamma \right)$.
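The interval construction described in Figures 3 and 4 can be sketched as follows. This is a minimal illustration, assuming each interval is drawn as $\tau \sim \textit{Beta}(\gamma, \gamma)$ and multiplied by $1/2$ to land in $[0, 1/2]$; the function names, the seed, and the default arguments are hypothetical and not taken from the paper's code. Since $\textit{Beta}(\gamma, \gamma)$ has mean $1/2$ and variance $1/(4(2\gamma + 1))$, the scaled interval has mean $1/4$, so 40 steps give an expected total time of 10, and its variance shrinks as $\gamma$ grows.

```python
import random


def sample_intervals(gamma, n_steps=40, scale=0.5, seed=0):
    """Draw n_steps observation intervals tau = scale * Beta(gamma, gamma)."""
    rng = random.Random(seed)
    return [scale * rng.betavariate(gamma, gamma) for _ in range(n_steps)]


def interval_variance(gamma, scale=0.5):
    """Closed-form variance of scale * Beta(gamma, gamma).

    Var[Beta(g, g)] = 1 / (4 * (2*g + 1)), multiplied by scale**2.
    """
    return scale ** 2 / (4.0 * (2.0 * gamma + 1.0))


def observation_times(intervals, t0=0.0):
    """Cumulative observation times starting from t0."""
    times = [t0]
    for tau in intervals:
        times.append(times[-1] + tau)
    return times


# Small gamma: highly irregular gaps. Large gamma: nearly regular gaps of 1/4.
irregular = observation_times(sample_intervals(0.5))
near_regular = observation_times(sample_intervals(10000.0))
```

Sweeping `gamma` between the two extremes changes only the spread of the gaps, not their mean, which is what lets the experiment isolate the effect of step-size irregularity.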

Theorems & Definitions (19)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Lemma 1
  • Proposition 4
  • Proposition 5
  • proof
  • Proposition 6
  • proof
  • Proposition 7
  • ...and 9 more