Mixing time of the conditional backward sampling particle filter
Joona Karjalainen, Anthony Lee, Sumeetpal S. Singh, Matti Vihola
TL;DR
This paper proves that the conditional backward sampling particle filter (CBPF) has $O(\log T)$ mixing time under a strong mixing condition, giving an overall time complexity of $O(T \log T)$. A practically implementable coupling of two CBPFs yields unbiased, finite-variance estimators at expected cost $O(T \log T)$. The paper introduces several coupling strategies (joint/independent maximal couplings and index-based variants) and analyzes their computational trade-offs, showing that state-based couplings often outperform index-based ones over long horizons. The results are demonstrated through four applications, including stochastic gradient maximum likelihood and calcium imaging, highlighting the CBPF’s scalability and its potential for efficient smoothing and gradient estimation in complex HMMs. Overall, the work bridges theory and practice by delivering rigorous mixing-time guarantees for the CBPF and practical unbiased estimators that can benefit a wide range of time-series inference tasks.
Abstract
The conditional backward sampling particle filter (CBPF) is a powerful Markov chain Monte Carlo sampler for general state space hidden Markov model (HMM) smoothing. It was proposed as an improvement over the conditional particle filter (CPF), which has an $O(T^2)$ complexity under a general 'strong' mixing assumption, where $T$ is the time horizon. Empirical evidence of the superiority of the CBPF over the CPF has never been theoretically quantified. We show that the CBPF has $O(T \log T)$ time complexity under strong mixing: its mixing time is upper bounded by $O(\log T)$, for any sufficiently large number of particles $N$ independent of $T$. This $O(\log T)$ mixing time is optimal. To prove our main result, we introduce a novel coupling of two CBPFs, which employs a maximal coupling of two particle systems at each time instant. The coupling is implementable and we use it to construct unbiased, finite-variance estimates of functionals which have arbitrary dependence on the latent state's path, with a total expected cost of $O(T \log T)$. We use this to construct unbiased estimates of the HMM's score function, and also investigate other couplings which can exhibit improved behaviour. We demonstrate our methods on financial and calcium imaging applications.
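To make the notion of a maximal coupling concrete, the following is a minimal sketch for two categorical distributions, written in Python with the standard library only. It is not the paper's coupling of two particle systems; it shows only the generic building block, and the function name `maximal_coupling` and the example weights are assumptions chosen for illustration. The pair it returns has the prescribed marginals and coincides with probability $1 - \mathrm{TV}(p, q)$, the largest value any coupling can achieve.

```python
import random


def maximal_coupling(p, q, rng=random):
    """Sample (i, j) with i ~ p, j ~ q and P(i == j) = 1 - TV(p, q).

    p and q are lists of probabilities over the same index set.
    Illustrative sketch only; not the particle-system coupling of the paper.
    """
    overlap = [min(a, b) for a, b in zip(p, q)]
    alpha = sum(overlap)  # mass shared by p and q, equal to 1 - TV(p, q)
    res_p = [a - o for a, o in zip(p, overlap)]  # part of p not shared with q
    res_q = [b - o for b, o in zip(q, overlap)]  # part of q not shared with p
    idx = range(len(p))

    # With probability alpha (or always, if a residual is empty), draw one
    # index from the normalised overlap and use it for both chains.
    if rng.random() < alpha or not any(res_p) or not any(res_q):
        i = rng.choices(idx, weights=overlap)[0]
        return i, i

    # Otherwise draw independently from the residuals. Their supports are
    # disjoint, so the two indices differ in this branch.
    i = rng.choices(idx, weights=res_p)[0]
    j = rng.choices(idx, weights=res_q)[0]
    return i, j


if __name__ == "__main__":
    rng = random.Random(1)
    p = [0.5, 0.3, 0.2]
    q = [0.2, 0.3, 0.5]
    n = 10_000
    meets = sum(i == j for i, j in (maximal_coupling(p, q, rng) for _ in range(n)))
    # Theoretical meeting probability: 1 - TV(p, q) = 0.2 + 0.3 + 0.2 = 0.7
    print("empirical meeting frequency:", meets / n)
```

In a coupled particle filter, a step of this kind would be applied to quantities such as the two systems' normalised particle weights, so that the two chains select the same particle as often as the marginal constraints allow.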
