Computable Bounds for Strong Approximations with Applications

Haoyu Ye, Morgane Austern

TL;DR

The paper delivers a computable strong-approximation bound of KMT type for bounded i.i.d. variables, with thresholds whose dependence on the distribution is only through the range $R$ and standard deviation $\sigma$. Using an inductive Stein-method construction and conditional Wasserstein bounds, it builds a nonasymptotic coupling between the partial sums $S_k$ and a Gaussian vector $(Z_k)$ with the same covariance, with the uniform control $\mathbb{P}(\exists k \le n: |S_k - Z_k| \ge \delta_k) \le \alpha$ and thresholds $\delta_k$ growing as $O(\log n\,(\log n - \log \alpha))$. The results enable time-uniform online change-point detection and nonasymptotic first-hitting-time bounds for random walks with drift, and an empirical variant covers the case of unknown variance. In addition, a Wasserstein-$p$ bound for sequences sampled without replacement yields a moderate deviation bound as a byproduct. Although the asymptotic rate is suboptimal by a logarithmic factor, the constants are explicit and depend only on $R$ and $\sigma$, so the bounds can be evaluated directly in finite-sample settings.
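To get a feel for the growth rate quoted above, the following minimal sketch evaluates only the shape $\log n\,(\log n - \log \alpha)$. The function name `threshold_shape` is hypothetical, and the explicit constants depending on $R$ and $\sigma$ are not reproduced here, so the values are not the paper's thresholds.

```python
import math


def threshold_shape(n: int, alpha: float) -> float:
    """Return log(n) * (log(n) - log(alpha)), the growth shape only.

    Assumption: all constants are omitted, so this is NOT the paper's
    threshold; it only shows how the rate scales with n and alpha.
    """
    if n < 2 or not (0.0 < alpha < 1.0):
        raise ValueError("need n >= 2 and 0 < alpha < 1")
    return math.log(n) * (math.log(n) - math.log(alpha))


# Sample sizes n = 2^L, as in the Figure 1 setup, at level alpha = 0.1.
for L in (4, 8, 14):
    n = 2 ** L
    print(n, round(threshold_shape(n, 0.1), 2))
```

The shape grows polylogarithmically in $n$ and increases as $\alpha$ shrinks, which is the extra logarithmic factor relative to the classical KMT rate mentioned in the abstract.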

Abstract

The Komlós–Major–Tusnády (KMT) inequality for partial sums is one of the most celebrated results in probability theory. Yet its practical application has been hindered by a lack of practical constants. This paper addresses this limitation for bounded i.i.d. random variables. At the cost of an additional logarithmic factor, we propose a computable version of the KMT inequality that depends only on the variables' range and standard deviation. We also derive an empirical version of the inequality that achieves nominal coverage even when the standard deviation is unknown. We then demonstrate the practicality of our bounds through applications to online change point detection and first hitting time probabilities. As a byproduct of our analysis, we obtain a Cramér-type moderate deviation bound for normalized centered partial sums.

Paper Structure

This paper contains 42 sections, 56 theorems, 382 equations, 3 figures, 6 algorithms.

Key Result

Lemma 2.1

Let $(Y_i)_{i\ge 1}$ be generated according to Assumption ($R,\sigma$). Let $(\delta_k)_{k\le n}$ be a sequence of positive reals satisfying $\delta_k=\delta_{n-k}$ for all $k>n/2$. Suppose that $n$ is even and that one can construct $(\tilde{Z}_k^1)$ satisfying […] coupled with $\mathcal{I}$, an integer interval starting at $1$ […], where $\omega_p^R(n, \sigma)$ is defined in thm:main1.

Figures (3)

  • Figure 1: Comparison between our bound and the bounds in \ref{eqn:comparison1} and \ref{eqn:comparison2}. Each panel uses $\alpha=0.1$ and sample sizes $n = 2^L$ with $L \in \{1, \dots, 14\}$; the panels correspond to the ranges $R=2,4,10$. The bound \ref{eqn:comparison1} (x-marked lines), derived from Bhattacharjee16, applies to random variables $X_i$ sampled i.i.d. from the symmetric finite set $\mathcal{A}_N := \{-R/2, -R/2 + R/(N-1), \dots, R/2\}$ for some $N > R$. We choose the distribution so that $X_i$ has mean zero, unit variance, and zero third moment. The dashed red line corresponds to the bound in \ref{eqn:comparison2}, derived from castelle1998strong. The solid black line shows our bound.
  • Figure 2: Detection rate comparison across mean shifts, with each $Y_i$ an average of $\ell$ independent $\textnormal{Uniform}[0,1]$ random variables. A change point occurs at $T = 2000$, with a post-change mean increased by the specified shift.
  • Figure 3: Minimum value of $N$ for which the upper bound in \ref{grr} becomes non-trivial.
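
The data-generating setup described in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the authors' experiment code: the function name `simulate_stream`, the seed handling, and the convention that observations at 0-based index `change_point` and later carry the shift are assumptions.

```python
import random
from statistics import fmean


def simulate_stream(n, ell, shift, change_point=2000, seed=0):
    """Simulate a stream with a single upward mean shift.

    Each Y_i is the average of `ell` independent Uniform[0, 1] draws.
    Assumption: observations at 0-based index >= change_point are
    shifted upward by `shift`; earlier ones are left unchanged.
    """
    rng = random.Random(seed)
    stream = []
    for i in range(n):
        y = fmean(rng.random() for _ in range(ell))
        if i >= change_point:
            y += shift
        stream.append(y)
    return stream


ys = simulate_stream(n=4000, ell=5, shift=0.2)
print(fmean(ys[:2000]), fmean(ys[2000:]))
```

Before the change each $Y_i$ lies in $[0,1]$ with mean $1/2$, so the stream is bounded, which is the setting the bounds require; a detector would then be run on `ys` to estimate the detection rate for each shift.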

Theorems & Definitions (103)

  • Definition 1: Wasserstein Distance
  • Definition 2: Conditional Wasserstein Distance
  • Lemma 2.1
  • proof
  • Theorem 2.2
  • Corollary 2.2.1
  • Theorem 2.3
  • Theorem 2.4
  • Remark 2.5
  • Lemma 2.6
  • ...and 93 more
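
The text of Definition 1 is not included in this listing. As a hedged illustration of the general notion only, the sketch below computes the standard empirical Wasserstein-$p$ distance between two equal-size one-dimensional samples, where the optimal coupling pairs sorted values. The function name `wasserstein_p_1d` is hypothetical, and this is not the conditional distance of Definition 2.

```python
def wasserstein_p_1d(xs, ys, p=1.0):
    """Empirical Wasserstein-p distance between two 1D samples.

    Assumption: both samples have the same size and equal weights, so
    the optimal coupling matches the i-th smallest value of xs with the
    i-th smallest value of ys.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    pairs = zip(sorted(xs), sorted(ys))
    cost = sum(abs(a - b) ** p for a, b in pairs) / len(xs)
    return cost ** (1.0 / p)


# Shifting every point by 1 gives distance 1 for any p >= 1.
print(wasserstein_p_1d([0.0, 1.0], [1.0, 2.0], p=2))
```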