Towards black-box parameter estimation

Amanda Lenzi, Haavard Rue

TL;DR

This work tackles parameter estimation when likelihoods are intractable but simulation is feasible by introducing a black-box, neural-network-based framework. It combines a sequential training procedure that steers simulations toward the region of high data density with a fully pre-trained database strategy to enable rapid estimation for time-series of varying lengths. The methods are demonstrated on Gaussian i.i.d. problems, spatial extremes (Brown-Resnick max-stable processes), and non-Gaussian stochastic volatility models, showing substantial reductions in bias and competitive uncertainty quantification relative to traditional approaches. The approaches enable automatic, flexible parameter inference in complex models with spatial-temporal dependencies, representing a step toward a general black-box inference framework that can scale to larger problems. The proposed combination of iterative refinement, transformed parameterization, and time-series pre-training offers practical efficiency gains for real-time and large-scale applications.

Abstract

Deep learning algorithms have recently been shown to be a successful tool for estimating parameters of statistical models for which simulation is easy but likelihood computation is challenging. However, the success of these approaches depends on simulating parameters that sufficiently reproduce the observed data, and, at present, there is a lack of efficient methods to produce these simulations. We develop new black-box procedures to estimate parameters of statistical models based only on weak parameter structure assumptions. For well-structured likelihoods with frequent occurrences, such as in time series, this is achieved by pre-training a deep neural network on an extensive simulated database that covers a wide range of data sizes. For other types of complex dependencies, an iterative algorithm guides simulations to the correct parameter region in multiple rounds. These approaches can successfully estimate and quantify the uncertainty of parameters from non-Gaussian models with complex spatial and temporal dependencies. The success of our methods is a first step towards a fully flexible automatic black-box estimation framework.
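
The multi-round idea can be illustrated with a minimal sketch for a zero-mean Gaussian sample of size $J=20$ with true $\log(\sigma_0^2)=1$, the setting of the paper's first figure. Everything else here is an assumption made for illustration and not the authors' algorithm: a one-dimensional least-squares fit on a log mean-square summary stands in for the neural network, the next simulation interval is obtained by halving the width around the current estimate rather than by the bootstrap step mentioned in the figure captions, and the names `simulate`, `summary`, `fit_line`, and `sequential_estimate` are hypothetical.

```python
import math
import random


def simulate(log_var, J, rng):
    """Draw J zero-mean Gaussian values with variance exp(log_var)."""
    sd = math.exp(0.5 * log_var)
    return [rng.gauss(0.0, sd) for _ in range(J)]


def summary(y):
    """Log of the mean square; for zero-mean data this is the MLE of log(sigma^2)."""
    return math.log(sum(v * v for v in y) / len(y))


def fit_line(xs, ys):
    """Ordinary least squares for ys ~ a + b * xs (stand-in for the neural network)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def sequential_estimate(y_obs, lo, hi, rounds=5, N=2000, seed=0):
    """Re-simulate training data in a shrinking interval around the current estimate."""
    rng = random.Random(seed)
    s_obs = summary(y_obs)
    history = []
    for _ in range(rounds):
        # Simulate parameters uniformly in the current interval, then data given each one.
        thetas = [rng.uniform(lo, hi) for _ in range(N)]
        feats = [summary(simulate(t, len(y_obs), rng)) for t in thetas]
        # Learn the map from data summary to parameter and apply it to the observed data.
        a, b = fit_line(feats, thetas)
        est = a + b * s_obs
        history.append(est)
        # Assumed update rule: centre the next interval on the estimate, halve its width.
        half = (hi - lo) / 4.0
        lo, hi = est - half, est + half
    return history


if __name__ == "__main__":
    data_rng = random.Random(1)
    y_obs = simulate(1.0, 20, data_rng)
    print("MLE of log variance:", summary(y_obs))
    print("estimates per round:", sequential_estimate(y_obs, -3.0, 5.0))
```

In this toy setting the estimate from each round should stay close to the MLE while the training interval narrows, which mirrors the qualitative behaviour the abstract describes without reproducing the paper's uncertainty quantification.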

Paper Structure

This paper contains 25 sections, 9 equations, 7 figures, 1 table, 3 algorithms.

Figures (7)

  • Figure 1: Boxplots of training data (grey boxes), fitted values (red line), and bootstrapped samples (blue boxes) as iterations progress for estimating $\log(\sigma_0^2) = 1$ from a zero-mean Gaussian distributed sample of size $J=20$ using Algorithm \ref{alg2}. The horizontal dashed line corresponds to the MLE. Training datasets were initially simulated in the uniform interval, with $N=10000$ samples.
  • Figure 2: Boxplots of the training data (grey) and bootstrap uncertainty (blue) at different iterations of Algorithm \ref{alg2}. Points in the red line are the fitted values from the MLP and the green dashed lines are the MLEs. Training output data were initially simulated using $N = 10000$ training samples each of length $J=20$ as: (a) $\mu_n$ and $\log(\sigma_n^2)$, (b) $\mu_n$ and $\sigma_n^2$, (c) $\mu_n^2$ and $\log(\mu_n^2 + \sigma_n^2)$, and (d) $\mu_n^2$ and $\mu_n^2 + \sigma_n^2$.
  • Figure 3: Scatterplots of estimated parameters on the transformed scales. The points show 100 independent estimates from the Brown-Resnick model obtained with the CNN (green) or PL (red). The $\times$ marks the true value. Training datasets were initially simulated using $N = 6000$ training samples on a $[0,30]^2$ and based on estimates from fitting a powered exponential covariance function to the data.
  • Figure 4: Boxplots of the training data (grey) and bootstrap uncertainty (blue) at different iterations of Algorithm \ref{alg3} for $\theta_1$ (left) and $\theta_2$ (right). The top/bottom row shows an example of when the variogram estimates are outside/inside the initial training data. Points in the red line are the fitted values from the CNN, and the green dashed lines are the truth.
  • Figure 5: Data from an autoregressive process of order one (AR(1)) of length $T=50$ with AR coefficient equal to 0.9, augmented five times to achieve the training data length $T_k = 250$.
  • ...and 2 more figures
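
The setup in the Figure 5 caption (an AR(1) series of length $T=50$ with coefficient 0.9, extended to the training length $T_k = 250$) can be sketched as follows. This assumes that "augmented five times" means repeating the observed series end to end, which the caption does not spell out; the unit innovation variance, the stationary starting value, and the names `simulate_ar1` and `augment_to_length` are likewise hypothetical choices for illustration.

```python
import random


def simulate_ar1(T, phi, rng, sd=1.0):
    """Simulate x_t = phi * x_{t-1} + e_t with e_t ~ N(0, sd^2), started at stationarity."""
    x = rng.gauss(0.0, sd / (1.0 - phi ** 2) ** 0.5)
    series = [x]
    for _ in range(T - 1):
        x = phi * x + rng.gauss(0.0, sd)
        series.append(x)
    return series


def augment_to_length(series, target_len):
    """Repeat the series end to end and truncate to target_len (assumed augmentation)."""
    reps = -(-target_len // len(series))  # ceiling division
    return (series * reps)[:target_len]


if __name__ == "__main__":
    y = simulate_ar1(50, 0.9, random.Random(0))
    y_aug = augment_to_length(y, 250)
    print(len(y), len(y_aug))
```

Padding every series to a common length in this way would let a single pre-trained network accept inputs of different original lengths, which is the purpose the TL;DR attributes to the pre-trained database strategy.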