Denoising Particle Filters: Learning State Estimation with Single-Step Objectives

Lennart Röstel, Berthold Bäuml

TL;DR

This work addresses robust state estimation for robotics under partial observability and non-linear dynamics. It introduces Denoising Particle Filters (DnPF), which learn state transition and measurement models using single-step objectives and perform posterior sampling with a diffusion-based denoising process guided by a learned dynamics prior and a measurement likelihood score. A key contribution is a likelihood-constrained diffusion mechanism that keeps particles close to the data manifold, enabling modular sensor fusion without retraining. Across simulated tasks in MuJoCo, DnPF achieves accuracy competitive with end-to-end baselines, shows improved robustness to distribution shifts, and integrates external sensor models easily.

Abstract

Learning-based methods commonly treat state estimation in robotics as a sequence modeling problem. While this paradigm can be effective at maximizing end-to-end performance, models are often difficult to interpret and expensive to train, since training requires unrolling sequences of predictions in time. As an alternative to end-to-end trained state estimation, we propose a novel particle filtering algorithm in which models are trained from individual state transitions, fully exploiting the Markov property in robotic systems. In this framework, measurement models are learned implicitly by minimizing a denoising score matching objective. At inference, the learned denoiser is used alongside a (learned) dynamics model to approximately solve the Bayesian filtering equation at each time step, effectively guiding predicted states toward the data manifold informed by measurements. We evaluate the proposed method on challenging robotic state estimation tasks in simulation, demonstrating competitive performance compared to tuned end-to-end trained baselines. Importantly, our method offers the desirable composability of classical filtering algorithms, allowing prior information and external sensor models to be incorporated without retraining.
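
For reference, the Bayesian filtering equation mentioned in the abstract can be written out in its standard textbook form, together with its gradient (score) form and a generic conditional denoising score matching loss. This is a sketch under the usual Markov assumptions ($y_t$ depends only on $x_t$; $x_t$ depends only on $x_{t-1}$ and $u_t$). The symbols $s_\theta$ (score network) and $\sigma$ (noise level) are notation assumed here for illustration; the paper's exact parameterization and loss weighting may differ.

```latex
\begin{align}
p(x_t \mid y_{1:t}, u_{1:t})
  &\propto p(y_t \mid x_t)
     \int p(x_t \mid x_{t-1}, u_t)\,
          p(x_{t-1} \mid y_{1:t-1}, u_{1:t-1})\, dx_{t-1}, \\
\nabla_{x_t} \log p(x_t \mid y_{1:t}, u_{1:t})
  &= \nabla_{x_t} \log p(y_t \mid x_t)
   + \nabla_{x_t} \log p(x_t \mid y_{1:t-1}, u_{1:t}), \\
\mathcal{L}_{\text{DSM}}(\theta)
  &= \mathbb{E}_{(x, y),\, \sigma,\, \epsilon \sim \mathcal{N}(0, I)}
     \left\| s_\theta(x + \sigma\epsilon,\, y,\, \sigma)
            + \frac{\epsilon}{\sigma} \right\|^2 .
\end{align}
```

The second line follows from taking the logarithm and gradient of the first: the normalizing constant does not depend on $x_t$, so the posterior score splits into a measurement term and a prediction term. The third line only requires individual $(x, y)$ pairs, which is why no unrolling through time is needed to train it.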

Paper Structure

This paper contains 19 sections, 16 equations, 8 figures, 1 table, and 1 algorithm.

Figures (8)

  • Figure 1: One timestep of inference with Denoising Particle Filters (DnPF). DnPF approximates the posterior $p(x_t|y_{1:t}, u_{1:t})$ as a set of particles $\{x_{t}^{i}\}_{i=1}^N$ and recursively solves the Bayesian filtering equation in score space. At each timestep $t$, each particle undergoes a series of integration steps $s = 0\rightarrow 1$, moving according to the sum of the dynamics score $\nabla\log p_s(x_t|x_{t-1}, u_t)$, the data likelihood score $\nabla\log p_s(x_t|y_t)$, and, optionally, the score of a known external sensor model $\nabla\log p_s(\hat{y}_t | x_t)$. Instead of starting from pure noise at each timestep, particles are warm-started with noise-perturbed predictions from the (learned) dynamics model (top). The data likelihood score $\nabla\log p_s(x_t|y_t)$ is predicted by a score network $D(x_t, y_t, s)$, which can be trained efficiently via denoising score matching.
  • Figure 2: DnPF network structure for efficient denoising inference.
  • Figure 3: Simulated state estimation tasks used in the experiments. a) Manipulator Spin: a 7-dof manipulator interacts with an object on a spinning table. b) Multi-fingered Manipulation: a 12-dof robotic hand manipulates a rigid object. c) Cluttered Push: a manipulator pushes three cylindrical objects around on a plane. d) Image observation from the Cluttered Push task, showing a top-down view with occlusions.
  • Figure 4: DnPF predictions for the Manipulator Spin task. Shown are estimates for the normalized X-component of the object position, with ground truth in red. At $\sim 2.5\text{s}$, the manipulator end-effector contacts the object, greatly reducing the space of possible object configurations. After being pulled toward the induced measurement likelihood, DnPF particles follow the (nearly deterministic) dynamics model again, tracking the ground truth closely. Top: particle rollout without likelihood constraint. Middle: particle rollout with likelihood constraint ($\theta = 2.0$). Bottom: magnitude of the predicted measurement score $|\epsilon_{\text{lh}}|$.
  • Figure 5: Particle predictions in the Multi-fingered Manipulation task for one normalized position component (top row) and one orientation component (bottom row). Ground truth is shown in red. The first column shows particles of an open-loop rollout using only the learned dynamics model $f$. The second column shows a particle rollout of DnPF using only proprioceptive measurements. The particle distribution for the rotation component is multi-modal due to the rotational symmetry of the object. In the last column, a simulated external sensor model is additionally included in the DnPF updates, providing a (noisy) measurement of rotation $\hat{y}_{\text{rot}} = X_{\text{rot}} + \epsilon$ with $\epsilon \sim \mathcal{N}(0, 0.1^2 \mathrm{I})$.
  • ...and 3 more figures
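
The per-timestep update described in the Figure 1 caption (warm-start from the dynamics prediction, then move each particle along a sum of score terms) can be illustrated with a minimal NumPy sketch on a 1-D toy problem. Everything here is an assumption made for illustration and not the paper's implementation: the linear-Gaussian dynamics and sensor models, the analytic scores standing in for learned networks, the function names, and the use of plain unadjusted Langevin steps at a single noise level instead of the paper's integration over $s = 0\rightarrow 1$ with noise-dependent scores and a likelihood constraint.

```python
import numpy as np


def dynamics_score(x, pred_mean, q):
    """Score w.r.t. x of a toy Gaussian transition density N(x; pred_mean, q^2)."""
    return (pred_mean - x) / q**2


def gaussian_sensor_score(x, y, r):
    """Score w.r.t. x of a toy Gaussian sensor likelihood N(y; x, r^2)."""
    return (y - x) / r**2


def filter_step(particles, u, y, a=0.9, q=0.1, r=0.2,
                extra_scores=(), n_steps=200, step=1e-3, rng=None):
    """One toy filtering timestep: warm-start, then Langevin steps on summed scores.

    All parameters are hypothetical: `a`, `q`, `r` define the toy models, and
    `extra_scores` is a sequence of callables x -> score for additional sensors.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Toy dynamics model: x_t = a * x_{t-1} + u_t (+ process noise with std q).
    pred_mean = a * particles + u
    # Warm start from noise-perturbed dynamics predictions, not from pure noise.
    x = pred_mean + q * rng.standard_normal(particles.shape)
    for _ in range(n_steps):
        # Sum of score terms: dynamics prior plus measurement likelihood.
        score = dynamics_score(x, pred_mean, q) + gaussian_sensor_score(x, y, r)
        # Extra sensor models are composed by simply adding their scores.
        for extra in extra_scores:
            score = score + extra(x)
        # Unadjusted Langevin update (a stand-in for the paper's integrator).
        x = x + step * score + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prev = np.full(1000, 0.5)  # particles from the previous timestep
    post = filter_step(prev, u=0.0, y=1.0, rng=rng)
    # Add a second, more precise sensor without changing the other models.
    fused = filter_step(
        prev, u=0.0, y=1.0,
        extra_scores=[lambda x: gaussian_sensor_score(x, 1.0, 0.1)],
        rng=rng,
    )
    print("mean without extra sensor:", post.mean())
    print("mean with extra sensor:   ", fused.mean())
```

In this toy setting each particle settles between its dynamics prediction and the measurement, weighted by the respective precisions. Passing an additional score through `extra_scores` pulls the particles further toward the measured value while leaving the other models untouched, which mirrors the composability the captions describe.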