Simulation-based inference via telescoping ratio estimation for trawl processes

Dan Leonte, Raphaël Huser, Almut E. D. Veraart

TL;DR

The paper tackles the challenge of inference for intractable, non-Gaussian time series by introducing a fast, sample-efficient simulation-based framework. It builds on neural ratio estimation (NRE) but overcomes its calibration and mode-mapping limitations by developing telescoping ratio estimation (TRE), which decomposes the likelihood ratio into a product of one-dimensional conditionals and learns them sequentially. Posterior sampling is then performed without MCMC through Chebyshev polynomial approximations of the conditional densities, enabling GPU-accelerated, scalable inference and straightforward posterior diagnostics. The authors also develop calibration procedures and per-parameter diagnostics that improve credible coverage and support amortization across sequence lengths, demonstrated on trawl processes and energy-demand data. Collectively, TRE offers a robust, efficient SBI tool for complex stochastic processes with rich dependence structures and non-Gaussian marginals, with potential extensions to spatio-temporal ambit fields and non-stationary settings.

Abstract

The growing availability of large and complex datasets has increased interest in temporal stochastic processes that can capture stylized facts such as marginal skewness, non-Gaussian tails, long memory, and even non-Markovian dynamics. While such models are often easy to simulate from, parameter estimation remains challenging. Simulation-based inference (SBI) offers a promising way forward, but existing methods typically require large training datasets or complex architectures and frequently yield confidence (credible) regions that fail to attain their nominal values, raising doubts on the reliability of estimates for the very features that motivate the use of these models. To address these challenges, we propose a fast and accurate, sample-efficient SBI framework for amortized posterior inference applicable to intractable stochastic processes. The proposed approach relies on two main steps: first, we learn the posterior density by decomposing it sequentially across parameter dimensions. Then, we use Chebyshev polynomial approximations to efficiently generate independent posterior samples, enabling accurate inference even when Markov chain Monte Carlo methods mix poorly. We further develop novel diagnostic tools for SBI in this context, as well as post-hoc calibration techniques; the latter not only lead to performance improvements of the learned inferential tool, but also to the ability to reuse it directly with new time series of varying lengths, thus amortizing the training cost. We demonstrate the method's effectiveness on trawl processes, a class of flexible infinitely divisible models that generalize univariate Gaussian processes, applied to energy demand data.
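The second step described in the abstract, drawing independent samples from a one-dimensional density through a Chebyshev polynomial approximation, can be illustrated with a small sketch. This is not the authors' implementation: the function name `sample_from_density`, the polynomial degree, and the use of bisection to invert the CDF are all assumptions made for illustration, and the density is assumed to be smooth, non-negative and supported on a known interval `[lo, hi]`.

```python
import numpy as np
from numpy.polynomial import Chebyshev


def sample_from_density(unnorm_pdf, lo, hi, n_samples, deg=64, rng=None):
    """Draw independent samples from a 1D density on [lo, hi].

    Hypothetical helper: the density is interpolated by a Chebyshev
    polynomial, integrated exactly to obtain a CDF, and the CDF is
    inverted by bisection (an assumed choice, not taken from the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Chebyshev interpolant of the (possibly unnormalized) density.
    pdf_poly = Chebyshev.interpolate(unnorm_pdf, deg, domain=[lo, hi])
    # Antiderivative that vanishes at lo, i.e. an unnormalized CDF.
    cdf_poly = pdf_poly.integ(lbnd=lo)
    total_mass = cdf_poly(hi)
    # Inverse-CDF sampling: solve cdf_poly(x) = u * total_mass for x.
    targets = rng.uniform(size=n_samples) * total_mass
    left = np.full(n_samples, lo, dtype=float)
    right = np.full(n_samples, hi, dtype=float)
    for _ in range(60):  # each pass halves every bracket
        mid = 0.5 * (left + right)
        below = cdf_poly(mid) < targets
        left = np.where(below, mid, left)
        right = np.where(below, right, mid)
    return 0.5 * (left + right)


if __name__ == "__main__":
    # Toy target: a standard normal truncated to [-6, 6].
    draws = sample_from_density(lambda x: np.exp(-0.5 * x**2), -6.0, 6.0, 10_000)
    print(draws.mean(), draws.std())
```

Because every draw is obtained by inverting the CDF at an independent uniform variate, the samples are independent by construction, which is the property the abstract contrasts with poorly mixing Markov chain Monte Carlo.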

Paper Structure

This paper contains 42 sections, 2 theorems, 44 equations, 11 figures, 7 tables, 3 algorithms.

Key Result

Theorem 3.1

Let $q_0(\boldsymbol{x},\boldsymbol{\theta}) = p(\boldsymbol{x})p(\boldsymbol{\theta})$, $q_m(\boldsymbol{x},\boldsymbol{\theta}) = p(\boldsymbol{x},\boldsymbol{\theta})$ and $q_i(\boldsymbol{x},\boldsymbol{\theta}) =p(\boldsymbol{x},\boldsymbol{\theta}^{1:i}) p(\boldsymbol{\theta}^{i+1:m})$ for $i
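The definitions of $q_0$, $q_i$ and $q_m$ given here already imply the telescoping identity that gives the method its name. The following worked derivation is a sketch that uses only those definitions and the chain rule of probability; it is not a restatement of the theorem itself.

```latex
\begin{align*}
r(\boldsymbol{x},\boldsymbol{\theta})
  &= \frac{p(\boldsymbol{x},\boldsymbol{\theta})}{p(\boldsymbol{x})\,p(\boldsymbol{\theta})}
   = \frac{q_m(\boldsymbol{x},\boldsymbol{\theta})}{q_0(\boldsymbol{x},\boldsymbol{\theta})}
   = \prod_{i=1}^{m} \frac{q_i(\boldsymbol{x},\boldsymbol{\theta})}{q_{i-1}(\boldsymbol{x},\boldsymbol{\theta})}, \\[4pt]
\frac{q_i(\boldsymbol{x},\boldsymbol{\theta})}{q_{i-1}(\boldsymbol{x},\boldsymbol{\theta})}
  &= \frac{p(\boldsymbol{x},\boldsymbol{\theta}^{1:i})\,p(\boldsymbol{\theta}^{i+1:m})}
          {p(\boldsymbol{x},\boldsymbol{\theta}^{1:i-1})\,p(\boldsymbol{\theta}^{i:m})}
   = \frac{p(\theta^{i}\mid \boldsymbol{x},\boldsymbol{\theta}^{1:i-1})}
          {p(\theta^{i}\mid \boldsymbol{\theta}^{i+1:m})}.
\end{align*}
```

Taking logarithms of the second line gives $\log p(\theta^{i}\mid \boldsymbol{x},\boldsymbol{\theta}^{1:i-1}) - \log p(\theta^{i}\mid \boldsymbol{\theta}^{i+1:m})$, which matches the two terms that the Figure 1 caption adds to obtain $\log \hat{r}_i(\boldsymbol{x},\boldsymbol{\theta})$.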

Figures (11)

  • Figure 1: TRE architectures for learning likelihood ratios. Outputs are on the log scale for stability. (a) Independent classifiers: Each encoder (e.g., Long Short-Term Memory, or LSTM) processes the data (e.g., time series) $\boldsymbol{x}$ into summary statistics $s_i(\boldsymbol{x})$, which are concatenated with $\boldsymbol{\theta}$ and passed to a multilayer perceptron (MLP) to approximate $\log{\hat{p}(\theta^{i}\mid \boldsymbol{x}, \boldsymbol{\theta}^{1:i-1} )}$. This is added to $-\log{p(\theta^i \mid \boldsymbol{\theta}^{i+1:m})}$ to yield $\log \hat{r}_i(\boldsymbol{x},\boldsymbol{\theta})$. (b) Shared encoder variant: All classifiers share a single encoder with separate MLP heads; see Section \ref{SM:comparison_with_original_TRE}. The $\oplus$ symbol indicates addition of $-\log p(\theta^i \mid \boldsymbol{\theta}^{i+1:m})$.
  • Figure 2: BCE, $\mathcal{S}$, accuracy, and $\mathcal{B}$ metrics (left to right) for the $\textrm{ACF},\, \mu,\, \sigma$ and $\beta$ classifiers (different colored lines), evaluated on a holdout dataset over the last 35000 training iterations. We train the classifiers with trawl process realizations $\boldsymbol{x}$ of length $1500$. The legend is displayed in the right panel.
  • Figure 3: Performance comparison of point estimators given by TRE, NRE and GMM. Left: trawl process realization corresponding to $\boldsymbol{\theta} = (13.36, 15.52, 0.97, 0.98, -0.17)$; middle: true (dashed) and inferred (solid) marginal distributions; right: true (dashed), empirical (solid-dotted), and inferred (solid) ACFs.
  • Figure 4: Comparison of the coverage deviation $\mathcal{C}_{\alpha} - \alpha$ before and after beta-calibration. Positive values indicate underconfidence, while negative ones indicate overconfidence. Top row: NRE and TRE. Bottom row: component NREs within TRE. Beta-calibration yields near-perfect coverage for the TRE, consistently improving TRE performance across all lengths $1000$, $1500$ and $2000$.
  • Figure 5: Arizona electricity demand in megawatt-hours as reported by Arizona Public Service Company (AZPS). Left: original demand time series together with estimated trend and seasonality; right: residuals.
  • ...and 6 more figures
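
The Figure 4 caption refers to beta-calibration of the classifiers. As a rough illustration, the sketch below assumes the standard three-parameter form of beta calibration, in which a classifier probability $s$ is mapped to $\sigma(a \log s - b \log(1-s) + c)$ and the parameters are fitted by logistic regression on held-out labelled pairs. The function names, the plain gradient-descent fit, and the step size are assumptions for illustration; the source does not state how the paper fits or applies the calibration map.

```python
import numpy as np


def _beta_features(scores, eps=1e-6):
    """Features [log s, -log(1 - s), 1] of the beta-calibration map."""
    s = np.clip(np.asarray(scores, dtype=float), eps, 1.0 - eps)
    return np.column_stack([np.log(s), -np.log1p(-s), np.ones_like(s)])


def fit_beta_calibration(scores, labels, n_iter=5000, lr=0.1):
    """Fit (a, b, c) by gradient descent on the binary cross-entropy.

    Hypothetical helper: the optimizer and its settings are assumed.
    """
    X = _beta_features(scores)
    y = np.asarray(labels, dtype=float)
    w = np.array([1.0, 1.0, 0.0])  # a = b = 1, c = 0 is the identity map
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)  # gradient of the mean BCE
    return w


def apply_beta_calibration(scores, w):
    """Map raw classifier probabilities to calibrated probabilities."""
    return 1.0 / (1.0 + np.exp(-(_beta_features(scores) @ w)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_p = rng.uniform(0.05, 0.95, size=5000)
    labels = (rng.uniform(size=5000) < true_p).astype(float)
    logits = np.log(true_p / (1.0 - true_p))
    overconfident = 1.0 / (1.0 + np.exp(-2.0 * logits))  # logits doubled
    a, b, c = fit_beta_calibration(overconfident, labels)
    print(a, b, c)
```

Starting the fit at $a = b = 1$, $c = 0$ means the procedure begins from the uncalibrated classifier and only moves away from it when the held-out data indicate miscalibration.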

Theorems & Definitions (8)

  • Definition 2.1
  • Theorem 3.1: Sample efficiency
  • Remark 3.2
  • Remark 3.3
  • Definition 3.4
  • Remark 4.1
  • Theorem S1
  • Remark S1