Mixing time of the conditional backward sampling particle filter

Joona Karjalainen; Anthony Lee; Sumeetpal S. Singh; Matti Vihola

TL;DR

This paper proves that the conditional backward sampling particle filter (CBPF) has $O(\log T)$ mixing time, and hence $O(T \log T)$ time complexity, under a strong mixing condition. The proof rests on a practically implementable coupling of two CBPFs, which also yields unbiased, finite-variance estimators at expected cost $O(T \log T)$. The paper introduces several coupling strategies (joint and independent maximal couplings, and index-based variants) and analyzes their computational trade-offs, showing that state-based couplings often outperform index-based ones over long time horizons. The results are demonstrated through four applications, including stochastic gradient maximum likelihood and calcium imaging, highlighting the CBPF’s scalability and its potential for efficient smoothing and gradient estimation in complex HMMs. Overall, the work pairs rigorous mixing-time guarantees for the CBPF with practical unbiased estimators applicable to a wide range of time-series inference tasks.

Abstract

The conditional backward sampling particle filter (CBPF) is a powerful Markov chain Monte Carlo sampler for general state space hidden Markov model (HMM) smoothing. It was proposed as an improvement over the conditional particle filter (CPF), which has an $O(T^2)$ complexity under a general `strong' mixing assumption, where $T$ is the time horizon. Empirical evidence of the superiority of the CBPF over the CPF has never been theoretically quantified. We show that the CBPF has $O(T \log T)$ time complexity under strong mixing: its mixing time is upper bounded by $O(\log T)$, for any sufficiently large number of particles $N$ independent of $T$. This $O(\log T)$ mixing time is optimal. To prove our main result, we introduce a novel coupling of two CBPFs, which employs a maximal coupling of two particle systems at each time instant. The coupling is implementable and we use it to construct unbiased, finite variance, estimates of functionals which have arbitrary dependence on the latent state's path, with a total expected cost of $O(T \log T)$. We use this to construct unbiased estimates of the HMM's score function, and also investigate other couplings which can exhibit improved behaviour. We demonstrate our methods on financial and calcium imaging applications.
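To make the two ingredients named in the abstract concrete, the following is a minimal sketch on a toy finite-state Markov chain, not on a CBPF. It assumes a maximal coupling of two discrete distributions and the standard lag-one coupled-chain debiasing construction; whether this matches the paper's own algorithms in detail is not confirmed by the text above. The function names, the 3-state transition matrix `P`, and the test function `h` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def maximal_coupling(p, q, rng):
    """Draw (i, j) with i ~ p and j ~ q such that P(i == j) = sum(min(p, q))."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    overlap = np.minimum(p, q)      # mass shared by the two distributions
    alpha = overlap.sum()           # meeting probability, equal to 1 - TV(p, q)
    res_p = p - overlap             # residual mass of p (support disjoint from res_q)
    res_q = q - overlap
    # Draw a common value if the residual is numerically empty
    # or if the overlap branch is selected.
    if res_p.sum() <= 1e-12 or rng.random() < alpha:
        i = rng.choice(len(p), p=overlap / alpha)
        return i, i
    i = rng.choice(len(p), p=res_p / res_p.sum())
    j = rng.choice(len(q), p=res_q / res_q.sum())
    return i, j


def unbiased_estimate(P, h, x0, y0, rng):
    """Lag-one coupled-chain estimate of the stationary mean of h (toy sketch).

    X runs one step ahead of Y; the bias-correction terms h(X_n) - h(Y_{n-1})
    are accumulated until the two chains meet.
    """
    x = rng.choice(len(P), p=P[x0])   # X_1, while Y stays at Y_0
    y = y0
    est = h[x0]                       # h(X_0)
    while x != y:
        est += h[x] - h[y]            # correction term for this step
        x, y = maximal_coupling(P[x], P[y], rng)  # coupled transition
    return est


# Hypothetical 3-state chain with stationary distribution (0.25, 0.5, 0.25).
P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])
h = np.array([0.0, 1.0, 2.0])
rng = np.random.default_rng(0)
draws = [unbiased_estimate(P, h, 0, 0, rng) for _ in range(5000)]
print(np.mean(draws))  # should be close to the stationary mean of h, which is 1.0
```

In this sketch each replicate stops as soon as the chains meet, so its cost is proportional to the meeting time; averaging independent replicates gives an estimate whose expectation is the stationary mean of `h`, provided both chains start from the same initial distribution.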

Paper Structure (25 sections, 25 theorems, 134 equations, 11 figures, 6 algorithms)

Introduction
Conditional backward sampling particle filter
Coupling of CBPF transitions
Alternative couplings and unbiased estimators
Joint maximal coupling
Independent maximal coupling
Independent index coupling
Joint index coupling
Computational considerations and hybrid strategies
Potentials with pairwise dependencies
Experiments
Barriers on a torus
Linear-Gaussian model
Stochastic gradient maximum likelihood
Calcium fluorescence imaging
...and 10 more sections

Key Result

Theorem 1

There exist finite constants $N_{\rm min}$ and $c_r$, which depend only on the constants in the strong mixing assumption, such that for all $N\ge N_{\rm min}$, any $T\ge 1$, any initial state $x_{1:T}^*\in\mathsf{X}^T$ and any $k\ge 1$, the following upper bound for the total variation distance holds:

Figures (11)

Figure 1: Average coupling times in Algorithm \ref{alg:unbiased} (using IMC and JMC) for the barriers model described in Section \ref{sec:strongmixing}. The experiments that failed to complete within the time limit of 4 hours (with $a = 0.1$, $b = 0.5$) are omitted from the graphs.
Figure 2: Average coupling times in Algorithm \ref{alg:unbiased} (using IIC and JIC) for the barriers model described in Section \ref{sec:strongmixing}. As above, the experiments that failed to complete within the time limit of 4 hours are omitted from the graphs. The grey areas illustrate a linear growth rate.
Figure 3: Average cost factors for Algorithm \ref{alg:unbiased} in the barriers model described in Section \ref{sec:strongmixing} with $a=0.1$. The experiments that failed to complete within the time limit of 4 hours are omitted from the graphs.
Figure 4: Average cost factors for Algorithm \ref{alg:unbiased} in the linear-Gaussian model described in Section \ref{sec:lgmodel} with parameter configurations $\theta_1$, $\theta_2$ and $\theta_3$. The experiments that failed to complete within the time limit of 8 hours are omitted from the graphs.
Figure 5: Illustration of the couplings as Algorithm \ref{alg:mc_cbpf} is iterated for the linear-Gaussian model (parameter configuration $\theta_3$) described in Section \ref{sec:lgmodel}. Uncoupled states, i.e. $X_t(i) \neq \tilde{X}_t(i)$, are shown as black pixels.
...and 6 more figures

Theorems & Definitions (48)

Theorem 1
Proposition 2
proof
Proposition 3
proof
Remark 4
Theorem 5
Corollary 6
proof
proof: Proof of Theorem \ref{thm:cbpf-mixing}
...and 38 more
