Mixing time of the conditional backward sampling particle filter

Joona Karjalainen, Anthony Lee, Sumeetpal S. Singh, Matti Vihola

TL;DR

This paper proves that the conditional backward sampling particle filter (CBPF) has $O(\log T)$ mixing time, and hence $O(T \log T)$ time complexity, under a strong mixing condition. The proof rests on a practically implementable coupling of two CBPFs, which also yields unbiased, finite-variance estimators at expected cost $O(T \log T)$. The paper introduces several coupling strategies (joint/independent maximal couplings and index-based variants) and analyzes their computational trade-offs, showing that state-based couplings often outperform index-based ones over long horizons. The results are demonstrated through four applications, including stochastic gradient maximum likelihood and calcium imaging, highlighting the CBPF’s scalability and its potential for efficient smoothing and gradient estimation in complex HMMs. Overall, the work connects theory and practice: it gives rigorous mixing-time guarantees for the CBPF together with practical unbiased estimators for time-series inference tasks.

Abstract

The conditional backward sampling particle filter (CBPF) is a powerful Markov chain Monte Carlo sampler for general state space hidden Markov model (HMM) smoothing. It was proposed as an improvement over the conditional particle filter (CPF), which has an $O(T^2)$ complexity under a general `strong' mixing assumption, where $T$ is the time horizon. Empirical evidence of the superiority of the CBPF over the CPF has never been theoretically quantified. We show that the CBPF has $O(T \log T)$ time complexity under strong mixing: its mixing time is upper bounded by $O(\log T)$, for any sufficiently large number of particles $N$ independent of $T$. This $O(\log T)$ mixing time is optimal. To prove our main result, we introduce a novel coupling of two CBPFs, which employs a maximal coupling of two particle systems at each time instant. The coupling is implementable and we use it to construct unbiased, finite variance, estimates of functionals which have arbitrary dependence on the latent state's path, with a total expected cost of $O(T \log T)$. We use this to construct unbiased estimates of the HMM's score function, and also investigate other couplings which can exhibit improved behaviour. We demonstrate our methods on financial and calcium imaging applications.
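
The maximal coupling mentioned in the abstract can be illustrated in its simplest textbook form. The sketch below is not the paper's coupling of particle systems; it assumes two categorical distributions `p` and `q` on the same finite index set, and the names `maximal_coupling`, `p`, `q` and `rng` are illustrative choices, not taken from the paper. It draws a pair whose marginals are `p` and `q` and which coincide with probability one minus their total variation distance, the largest probability any coupling can achieve.

```python
import random


def maximal_coupling(p, q, rng=random):
    """Draw (i, j) with i ~ p, j ~ q and P(i == j) = 1 - TV(p, q).

    p and q are lists of probabilities over the same indices 0..n-1.
    This is a generic textbook construction, shown here as an assumption-laden
    illustration rather than the coupling used in the paper.
    """
    indices = range(len(p))
    # Common mass shared by p and q at each index.
    overlap = [min(a, b) for a, b in zip(p, q)]
    alpha = sum(overlap)
    # Residual masses left after removing the common part.
    res_p = [a - m for a, m in zip(p, overlap)]
    res_q = [b - m for b, m in zip(q, overlap)]
    # With probability alpha (or if no residual mass is left), draw one
    # index from the normalised overlap and use it for both chains.
    if sum(res_p) <= 1e-15 or rng.random() < alpha:
        i = rng.choices(indices, weights=overlap)[0]
        return i, i
    # Otherwise draw from the residuals, whose supports are disjoint,
    # so the two indices necessarily differ.
    i = rng.choices(indices, weights=res_p)[0]
    j = rng.choices(indices, weights=res_q)[0]
    return i, j
```

Calling `maximal_coupling([0.5, 0.3, 0.2], [0.2, 0.3, 0.5], random.Random(1))` returns one coupled pair of indices; over many draws the pair should coincide in roughly 70% of cases, since the total variation distance between these two distributions is 0.3.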

Paper Structure (25 sections, 25 theorems, 134 equations, 11 figures, 6 algorithms)

Key Result

Theorem 1

There exist finite constants $N_{\rm min}$ and $c_r$ which depend only on the constants in the strong mixing assumption, such that for all $N\ge N_{\rm min}$, any $T\ge 1$, any initial state $x_{1:T}^*\in\mathsf{X}^T$ and $k\ge 1$, the following upper bound for the total variation distance holds:

Figures (11)

  • Figure 1: Average coupling times in Algorithm \ref{alg:unbiased} (using IMC and JMC) for the barriers model described in Section \ref{sec:strongmixing}. The experiments that failed to complete within the time limit of 4 hours (with $a = 0.1$, $b = 0.5$) are omitted from the graphs.
  • Figure 2: Average coupling times in Algorithm \ref{alg:unbiased} (using IIC and JIC) for the barriers model described in Section \ref{sec:strongmixing}. As above, the experiments that failed to complete within the time limit of 4 hours are omitted from the graphs. The grey areas illustrate a linear growth rate.
  • Figure 3: Average cost factors for Algorithm \ref{alg:unbiased} in the barriers model described in Section \ref{sec:strongmixing} with $a=0.1$. The experiments that failed to complete within the time limit of 4 hours are omitted from the graphs.
  • Figure 4: Average cost factors for Algorithm \ref{alg:unbiased} in the linear-Gaussian model described in Section \ref{sec:lgmodel} with parameter configurations $\theta_1$, $\theta_2$ and $\theta_3$. The experiments that failed to complete within the time limit of 8 hours are omitted from the graphs.
  • Figure 5: Illustration of the couplings as Algorithm \ref{alg:mc_cbpf} is iterated for the linear-Gaussian model (parameter configuration $\theta_3$) described in Section \ref{sec:lgmodel}. Uncoupled states, i.e. $X_t(i) \neq \tilde{X}_t(i)$, are shown as black pixels.
  • ...and 6 more figures

Theorems & Definitions (48)

  • Theorem 1
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • Remark 4
  • Theorem 5
  • Corollary 6
  • proof
  • proof: Proof of Theorem \ref{thm:cbpf-mixing}
  • ...and 38 more