Nesting Particle Filters for Experimental Design in Dynamical Systems

Sahel Iqbal, Adrien Corenflos, Simo Särkkä, Hany Abdulsamad

TL;DR

The paper addresses Bayesian experimental design (BED) for non-exchangeable dynamical systems by framing design optimization as risk-sensitive policy learning. It introduces Inside-Out SMC$^2$, a nested sequential Monte Carlo (SMC) method embedded in a particle MCMC framework, to amortize the optimal design policy. The method recasts the expected information gain (EIG) objective as a risk-sensitive inference problem over a non-Markovian trajectory model and uses an IBIS-based inner loop to approximate the filtered posterior over parameters, which enables gradient-based policy optimization via the score. It reports better, more sample-efficient EIG estimation than sPCE-based methods on stochastic pendulum, cart-pole, and double-link systems, and analyzes scalability, tempering, and limitations (notably the requirement of closed-form conditional densities). The approach targets real-time sequential design in complex dynamical systems, where non-exchangeability and long horizons hinder traditional BED approaches.
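To make the IBIS-based inner loop more concrete, the sketch below shows a single reweighting step of iterated batch importance sampling on a toy problem. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian model, the function names (`ibis_reweight`, `gaussian_log_lik`), and the particle count are all hypothetical, and the resample-move step that a full IBIS sampler applies when the effective sample size drops is omitted.

```python
import numpy as np


def log_sum_exp(a):
    """Numerically stable log(sum(exp(a)))."""
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))


def ibis_reweight(thetas, log_w, log_lik_fn, y_new):
    """Reweight parameter particles with one new observation.

    Returns normalized log-weights, an estimate of the log predictive
    density of y_new, and the effective sample size (ESS).
    """
    log_w = log_w - log_sum_exp(log_w)        # normalize incoming weights
    log_inc = log_lik_fn(y_new, thetas)       # per-particle log-likelihood
    log_pred = log_sum_exp(log_w + log_inc)   # log of weighted mean likelihood
    new_log_w = log_w + log_inc - log_pred    # normalized posterior weights
    ess = 1.0 / np.sum(np.exp(2.0 * new_log_w))
    return new_log_w, log_pred, ess


def gaussian_log_lik(y, thetas):
    """Hypothetical toy model (an assumption): y ~ N(theta, 1)."""
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * (y - thetas) ** 2


rng = np.random.default_rng(0)
n_particles = 20_000
thetas = rng.normal(size=n_particles)         # draws from an N(0, 1) prior
log_w = np.full(n_particles, -np.log(n_particles))

# Assimilate observations one at a time, as a filter would.
for y in [0.5, 1.0, 1.5]:
    log_w, log_pred, ess = ibis_reweight(thetas, log_w, gaussian_log_lik, y)

# For this conjugate toy model the exact posterior mean is sum(y) / 4 = 0.75.
posterior_mean = np.sum(np.exp(log_w) * thetas)
```

The weighted particles approximate the filtered posterior over the parameter after each observation, and `log_pred` estimates the one-step predictive density. In a nested scheme such as the one the summary describes, a step of this kind would presumably run once per outer trajectory particle.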

Abstract

In this paper, we propose a novel approach to Bayesian experimental design for non-exchangeable data that formulates it as risk-sensitive policy optimization. We develop the Inside-Out SMC$^2$ algorithm, a nested sequential Monte Carlo technique to infer optimal designs, and embed it into a particle Markov chain Monte Carlo framework to perform gradient-based policy amortization. Our approach is distinct from other amortized experimental design techniques, as it does not rely on contrastive estimators. Numerical validation on a set of dynamical systems showcases the efficacy of our method in comparison to other state-of-the-art strategies.
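As a rough illustration of what "risk-sensitive" means here, the sketch below computes the generic entropic risk of a set of sampled returns, $\frac{1}{\eta}\log\mathbb{E}[\exp(\eta R)]$. This is the textbook form of a risk-sensitive objective and is an assumption on our part: the paper's exact objective is not reproduced in this summary, and the function name `entropic_risk`, the sign convention for $\eta$, and the sample values are hypothetical.

```python
import numpy as np


def entropic_risk(returns, eta):
    """Entropic risk (1 / eta) * log E[exp(eta * R)] from samples of R.

    eta == 0 is treated as the risk-neutral limit, i.e. the sample mean.
    """
    returns = np.asarray(returns, dtype=float)
    if eta == 0.0:
        return float(np.mean(returns))
    a = eta * returns
    m = np.max(a)                              # shift for numerical stability
    log_mean_exp = m + np.log(np.mean(np.exp(a - m)))
    return float(log_mean_exp / eta)


sample_returns = [0.0, 1.0, 2.0]               # hypothetical returns, mean 1.0
risk_seeking = entropic_risk(sample_returns, eta=1.0)
risk_neutral = entropic_risk(sample_returns, eta=0.0)
risk_averse = entropic_risk(sample_returns, eta=-1.0)
```

By Jensen's inequality, a positive $\eta$ gives a value at or above the sample mean and a negative $\eta$ a value at or below it, while $\eta \to 0$ recovers the ordinary expectation.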

Paper Structure (31 sections, 2 theorems, 75 equations, 8 figures, 10 tables, 6 algorithms)

Key Result

Proposition 1

For models specified by the joint density (eq:joint_density), the expected information gain factorizes over time into stage rewards $r_{t}(z_{0:t})$, each defined through two terms, $\alpha_{t}(z_{0:t})$ and $\beta_{t}(z_{0:t})$. Furthermore, for models with additive, constant noise in the dynamics, the EIG admits a simplified expression that holds up to an additive constant (denoted '$\equiv$').

Figures (8)

  • Figure 1: Accumulation of the information gain computed in closed form for different policies on the conditionally linear stochastic pendulum with a Gaussian prior. We report the mean and standard deviation over $512$ realizations.
  • Figure 2: Training progression of the IO-SMC$^2$ policy and its exact variant on the conditionally linear stochastic pendulum. At every epoch, we evaluate the EIG estimate using the mean policy. We report the mean and standard deviation over $25$ seeds.
  • Figure 3: A sample experiment trajectory generated by the amortized policy during deployment on the nonlinear stochastic pendulum environment. $q$ is the angle of the pendulum from the vertical, $\dot{q}$ is the angular velocity and $\xi$ is the design.
  • Figure 4: A sample experiment trajectory generated by the policy during deployment on the stochastic cart-pole environment. Here, $s$ and $\dot{s}$ are the position and velocity of the cart respectively, $q$ is the angle of the pole, $\dot{q}$ is its angular velocity and $\xi$ is the design.
  • Figure 5: A sample experiment trajectory generated by the policy during deployment on the stochastic double-link environment. $q_1$ and $q_2$ are the angles from the vertical for the two links, $\dot{q}_1$ and $\dot{q}_2$ their respective angular velocities, and $\xi_1$ and $\xi_2$ are the designs.
  • ...and 3 more figures

Theorems & Definitions (4)

  • Proposition 1
  • Proposition 2: Consistency of the target distribution
  • proof
  • proof