DiffPF: Differentiable Particle Filtering with Generative Sampling via Conditional Diffusion Models

Ziyu Wan, Lin Zhao

TL;DR

DiffPF addresses state estimation in nonlinear, high-dimensional, and multimodal dynamics by introducing a conditional diffusion model that learns a flexible posterior sampler conditioned on predicted particles and current observations. It replaces traditional importance weighting and resampling with a diffusion-based update that yields equally weighted samples drawn from $p(\bm{x}_t \mid \bm{c}_t)$, enabling fully differentiable end-to-end training. Across synthetic and real-world benchmarks, including highly multimodal global localization and KITTI visual odometry, DiffPF outperforms state-of-the-art differentiable filters, often with far fewer particles. The approach demonstrates that conditional diffusion samplers can produce high-quality posterior representations in complex filtering tasks, with practical implications for robotics and perception.

Abstract

This paper proposes DiffPF, a differentiable particle filter that leverages diffusion models for state estimation in dynamic systems. Unlike conventional differentiable particle filters, which require importance weighting and typically rely on predefined or low-capacity proposal distributions, DiffPF learns a flexible posterior sampler by conditioning a diffusion model on predicted particles and the current observation. This enables accurate, equally weighted sampling from complex, high-dimensional, and multimodal filtering distributions. We evaluate DiffPF across a range of scenarios, including both unimodal and highly multimodal distributions, and test it on simulated as well as real-world tasks, where it consistently outperforms existing filtering baselines. In particular, DiffPF achieves an 82.8% improvement in estimation accuracy on a highly multimodal global localization benchmark and a 26% improvement on the real-world KITTI visual odometry benchmark, compared to state-of-the-art differentiable filters. To the best of our knowledge, DiffPF is the first method to integrate conditional diffusion models into particle filtering, enabling high-quality posterior sampling that produces more informative particles and significantly improves state estimation.
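
For contrast, the importance weighting and resampling step that conventional particle filters rely on can be sketched as follows. This is a textbook bootstrap-style update, not code from the paper: the 1-D state, the Gaussian observation likelihood, and the names `importance_weights`, `effective_sample_size`, and `multinomial_resample` are all assumptions made for illustration.

```python
import math
import random


def importance_weights(particles, obs, obs_std=0.5):
    """Normalized weights from a Gaussian observation likelihood (assumed model)."""
    log_w = [-0.5 * ((obs - x) / obs_std) ** 2 for x in particles]
    max_log_w = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(lw - max_log_w) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2); it equals N only when all weights are equal."""
    return 1.0 / sum(w * w for w in weights)


def multinomial_resample(particles, weights, rng):
    """Draw N particles with replacement in proportion to their weights."""
    return rng.choices(particles, weights=weights, k=len(particles))


rng = random.Random(0)
predicted = [rng.gauss(0.0, 2.0) for _ in range(100)]  # hypothetical predicted particles
weights = importance_weights(predicted, obs=3.0)
ess_weighted = effective_sample_size(weights)           # below N when weights are uneven
resampled = multinomial_resample(predicted, weights, rng)
ess_equal = effective_sample_size([1.0 / len(resampled)] * len(resampled))
```

The discrete draw in `multinomial_resample` is the step that does not pass gradients in a straightforward way, and uneven weights reduce the effective sample size. A sampler that directly returns equally weighted particles sidesteps both issues, which is the motivation the abstract gives for DiffPF.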

Paper Structure

This paper contains 15 sections, 11 equations, 7 figures, 7 tables.

Figures (7)

  • Figure 1: A conditional diffusion model parameterized by a U-Net is used to model the filtering posterior distribution and generate equally weighted particle samples via iterative denoising.
  • Figure 2: At time $t$, the estimated particles from time $t{-}1$ are propagated through the process model to obtain a prior distribution over the current state. Simultaneously, the observation $\bm{o}_t$ is encoded into a feature representation. These two sources of information jointly condition a diffusion model, which iteratively refines a set of noisy latent samples to generate equally weighted particles that approximate the filtering posterior. Unlike the DnD Filter [wan2025dnd], which uses a diffusion model in a Kalman filter–like manner to fuse prior and observation into a single trajectory, DiffPF maintains a full set of particles to explicitly represent the posterior distribution.
  • Figure 3: A sequence of three frames with the target disk indicated by a yellow box. The frames illustrate three different levels of occlusion caused by interfering disks.
  • Figure 4: (a) An example observation captured by the robot. (b) The red arrow indicates the true pose, while the blue arrows mark alternative poses in the map that yield similar observations, illustrating the multimodal nature of the observation-to-state mapping.
  • Figure 5: Estimated poses at selected time steps from a sequence using DiffPF. Red arrows indicate the ground truth, green arrows represent the estimated pose, and blue arrows represent the particle set.
  • ...and 2 more figures
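
The update described in the Figure 2 caption (propagate particles, encode the observation, then denoise noisy samples conditioned on both) can be sketched in a 1-D toy setting. This is a minimal illustration under strong assumptions, not the authors' implementation: `process_model` and `encode_observation` are hypothetical placeholders, and `toy_denoiser` is an analytic stand-in for the learned U-Net that returns the exact expected noise for a Gaussian target whose mean blends the predicted particle and the observation feature. The reverse loop is a standard DDPM-style ancestral sampler with a linear noise schedule, which is also an assumed choice.

```python
import math
import random


def make_schedule(num_steps=100, beta_start=1e-4, beta_end=0.1):
    """Linear noise schedule: per-step betas and cumulative alpha products."""
    betas = [beta_start + (beta_end - beta_start) * k / (num_steps - 1)
             for k in range(num_steps)]
    alpha_bars, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return betas, alpha_bars


def process_model(x_prev, action, rng, noise_std=0.2):
    """Hypothetical 1-D process model: move by `action` plus Gaussian noise."""
    return x_prev + action + rng.gauss(0.0, noise_std)


def encode_observation(obs):
    """Placeholder encoder (identity); a real system would learn this mapping."""
    return obs


def toy_denoiser(x_k, alpha_bar, pred, obs_feat, gain=0.8, target_std=0.1):
    """Analytic stand-in for the learned noise-prediction network (assumption)."""
    target_mean = pred + gain * (obs_feat - pred)
    var_k = alpha_bar * target_std ** 2 + (1.0 - alpha_bar)
    return math.sqrt(1.0 - alpha_bar) * (x_k - math.sqrt(alpha_bar) * target_mean) / var_k


def diffusion_update(pred, obs_feat, betas, alpha_bars, rng):
    """DDPM-style reverse process: start from noise, denoise into one particle."""
    x = rng.gauss(0.0, 1.0)
    for k in reversed(range(len(betas))):
        eps_hat = toy_denoiser(x, alpha_bars[k], pred, obs_feat)
        x = (x - betas[k] / math.sqrt(1.0 - alpha_bars[k]) * eps_hat) / math.sqrt(1.0 - betas[k])
        if k > 0:  # no noise is added on the final denoising step
            x += math.sqrt(betas[k]) * rng.gauss(0.0, 1.0)
    return x


def filter_step(particles, action, obs, betas, alpha_bars, rng):
    """One filtering step: predict, encode, then sample equally weighted particles."""
    predicted = [process_model(x, action, rng) for x in particles]
    obs_feat = encode_observation(obs)
    return [diffusion_update(p, obs_feat, betas, alpha_bars, rng) for p in predicted]


rng = random.Random(0)
betas, alpha_bars = make_schedule()
particles = [rng.gauss(0.0, 1.0) for _ in range(200)]
particles = filter_step(particles, action=1.0, obs=1.5,
                        betas=betas, alpha_bars=alpha_bars, rng=rng)
estimate = sum(particles) / len(particles)  # plain mean: all particles carry equal weight
```

Because every output particle is a direct sample from the conditional model, the state estimate is an unweighted mean and no resampling step appears anywhere in `filter_step`. In the toy setting the particles concentrate near the blend of prediction and observation that `toy_denoiser` encodes; a trained network would instead determine that target from data.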