Beyond likelihood ratio bias: Nested multi-time-scale stochastic approximation for likelihood-free parameter estimation

Zehao Li; Zhouchen Lin; Yijie Peng

TL;DR

This work introduces a ratio-free nested multi-time-scale stochastic approximation (NMTS) framework to address parameter inference in likelihood-free settings, where both the likelihood and its gradient are estimated by simulation. By coupling a fast-timescale gradient-tracker with a slow-timescale parameter (or variational) update, NMTS eliminates ratio bias and achieves faster, more stable convergence than single-timescale methods; it also provides strong convergence, weak convergence, and $\mathbb{L}^1$ rate results, with an explicit rate $O\left(\frac{\beta_k}{\alpha_k}+\sqrt{\frac{\alpha_k}{N}}\right)$ and an optimal scheduling yielding $O(k^{-1/3})$ MAE under appropriate choices. The framework extends to variational posterior inference and neural-network-based likelihood/posterior estimation, including two-network architectures trained at different time scales, and demonstrates improvements of one to two orders of magnitude in estimation accuracy at fixed computational budgets. Theoretical guarantees are complemented by numerical experiments on MLE and PDE tasks, plus toy and real-world neural-network implementations, underscoring NMTS’s efficiency and scalability for stochastic simulators. Overall, NMTS advances likelihood-free inference by delivering bias-free gradient estimation, rigorous convergence properties, and practical performance gains in complex, high-variance simulation settings.
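
One way to see where the $O(k^{-1/3})$ figure can come from is to balance the two terms of the rate, offered here as a heuristic sketch rather than the paper's own derivation. It assumes polynomial step sizes $\alpha_k = k^{-a}$ and $\beta_k = k^{-b}$ with $0 < a < b \le 1$; the exponents $a$, $b$ and the cap $b \le 1$ (the usual requirement that $\sum_k \beta_k$ diverge) are assumptions introduced for illustration.

$$
\begin{aligned}
\frac{\beta_k}{\alpha_k} + \sqrt{\frac{\alpha_k}{N}} &= k^{-(b-a)} + N^{-1/2}\,k^{-a/2}, \\
b - a = \frac{a}{2} \;&\Longrightarrow\; b = \frac{3a}{2}, \qquad \text{rate } O\big(k^{-a/2}\big), \\
b \le 1 \;&\Longrightarrow\; a \le \frac{2}{3}, \qquad a = \frac{2}{3},\ b = 1 \ \text{gives}\ O\big(k^{-1/3}\big).
\end{aligned}
$$

Under these assumptions the tracking term $\beta_k/\alpha_k$ and the noise term $\sqrt{\alpha_k/N}$ decay at the same speed, and the largest admissible exponent pair is $\alpha_k = k^{-2/3}$, $\beta_k = k^{-1}$.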

Abstract

We study parameter inference in simulation-based stochastic models where the analytical form of the likelihood is unknown. The main difficulty is that score evaluation as a ratio of noisy Monte Carlo estimators induces bias and instability, which we overcome with a ratio-free nested multi-time-scale (NMTS) stochastic approximation (SA) method that simultaneously tracks the score and drives the parameter update. We provide a comprehensive theoretical analysis of the proposed NMTS algorithm for solving likelihood-free inference problems, including strong convergence, asymptotic normality, and convergence rates. We show that our algorithm can eliminate the original asymptotic bias $O\big(\sqrt{\frac{1}{N}}\big)$ and accelerate the convergence rate from $O\big(\beta_k+\sqrt{\frac{1}{N}}\big)$ to $O\big(\frac{\beta_k}{\alpha_k}+\sqrt{\frac{\alpha_k}{N}}\big)$, where $N$ is the fixed batch size, and $\alpha_k$ and $\beta_k$ are decreasing step sizes with $\alpha_k, \beta_k, \beta_k/\alpha_k \rightarrow 0$. With proper choice of $\alpha_k$ and $\beta_k$, our convergence rates can match the optimal rate in the multi-time-scale SA literature. Numerical experiments demonstrate that our algorithm can improve the estimation accuracy by one to two orders of magnitude at the same computational cost, making it efficient for parameter estimation in stochastic systems.
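
The coupling of a fast score tracker with a slow parameter update can be illustrated on a toy problem. The sketch below is an illustration under stated assumptions, not the authors' implementation. It assumes a latent-variable model $Y = \theta + Z + \varepsilon$ with $Z, \varepsilon \sim \mathcal{N}(0,1)$, so that $p(y;\theta) = \mathbb{E}_Z[\varphi(y-\theta-Z)]$ and its $\theta$-derivative both have unbiased simulation estimators. The form of the fast recursion (driving $D$ toward the root of $p \cdot D = \nabla_\theta p$ without ever dividing), the function names `nmts_mle` and `normal_pdf`, the data `Y_OBS`, and the step-size constants are all hypothetical choices made for this example.

```python
import numpy as np


def normal_pdf(u):
    """Standard normal density, evaluated elementwise."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def nmts_mle(y, theta0=0.0, n_iter=20000, batch=10, seed=0):
    """Two-time-scale sketch for the toy model Y = theta + Z + eps.

    Assumed (illustrative) recursions, with fixed batch size N = batch:
      fast:  D_i     <- D_i + alpha_k * (g_hat_i - p_hat_i * D_i)
      slow:  theta   <- theta + beta_k * mean_i(D_i)
    where p_hat_i and g_hat_i are unbiased Monte Carlo estimates of
    p(y_i; theta) and d/dtheta p(y_i; theta). No ratio g_hat / p_hat
    is ever formed.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    theta = float(theta0)
    D = np.zeros_like(y)  # one score tracker per observation

    for k in range(n_iter):
        # Illustrative step sizes: alpha_k ~ k^(-2/3), beta_k ~ k^(-1),
        # so that beta_k / alpha_k -> 0.
        alpha = 2.0 / (k + 1) ** (2.0 / 3.0)
        beta = 2.0 / (k + 10)

        # Simulate the latent variable: shape (n_obs, batch).
        z = rng.standard_normal((y.size, batch))
        u = y[:, None] - theta - z
        w = normal_pdf(u)

        p_hat = w.mean(axis=1)        # estimate of p(y_i; theta)
        g_hat = (u * w).mean(axis=1)  # estimate of d/dtheta p(y_i; theta)

        # Slow update uses the current tracker, then the tracker moves.
        theta_next = theta + beta * D.mean()
        D = D + alpha * (g_hat - p_hat * D)
        theta = theta_next

    return theta


# Made-up toy observations, for illustration only.
Y_OBS = np.array([0.3, 0.9, 1.1, 1.4, 1.8, 0.6, 1.2, 2.0])
theta_hat = nmts_mle(Y_OBS)
print("NMTS estimate:", theta_hat, "sample mean:", Y_OBS.mean())
```

In this toy model the marginal law of $Y$ is $\mathcal{N}(\theta, 2)$, so the exact maximum likelihood estimate is the sample mean, which gives a reference value for the printed estimate. A plug-in alternative would replace `D` by `g_hat / p_hat` at every step, which is the ratio of noisy estimators that the abstract identifies as the source of bias.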
