
Sharp Concentration Inequalities: Phase Transition and Mixing of Orlicz Tails with Variance

Yinan Shen, Jinchi Lv

Abstract

In this work, we investigate how to develop sharp concentration inequalities for sub-Weibull random variables, including sub-Gaussian and sub-exponential distributions. Although the random variables may not be sub-Gaussian, the tail probability around the origin behaves as if they were sub-Gaussian, while elsewhere the tail probability decay aligns with the Orlicz $\Psi_\alpha$-tail. Specifically, for independent and identically distributed (i.i.d.) $\{X_i\}_{i=1}^n$ with finite Orlicz norm $\|X\|_{\Psi_\alpha}$, our theory unveils an interesting phase transition at $\alpha = 2$: ${\mathbb P}\left(\left|\sum_{i=1}^n X_i\right| \geq t\right)$ with $t > 0$ is upper bounded by $2\exp\left(-C\max\left\{\frac{t^2}{n\|X\|_{\Psi_\alpha}^2}, \frac{t^\alpha}{n^{\alpha-1}\|X\|_{\Psi_\alpha}^\alpha}\right\}\right)$ for $\alpha \geq 2$, and by $2\exp\left(-C\min\left\{\frac{t^2}{n\|X\|_{\Psi_\alpha}^2}, \frac{t^\alpha}{n^{\alpha-1}\|X\|_{\Psi_\alpha}^\alpha}\right\}\right)$ for $1 \leq \alpha \leq 2$, where $C$ is some positive constant. In many scenarios, it is necessary to distinguish the standard deviation from the Orlicz norm, since the latter can greatly exceed the former. To accommodate this, we build a new theoretical analysis framework, and our sharp, flexible concentration inequalities involve the variance and a mixing of Orlicz $\Psi_\alpha$-tails through the min and max functions. Our theory yields new, improved concentration inequalities even for the cases of sub-Gaussian and sub-exponential distributions with $\alpha = 2$ and $\alpha = 1$, respectively. We further demonstrate our theory on martingales, random vectors, random matrices, and covariance matrix estimation. These sharp concentration inequalities can empower more precise non-asymptotic analyses across different statistical and machine learning applications.
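To read the phase transition, note (a quick calculation from the displayed bound, not a claim taken from the paper itself) that the two exponents coincide exactly at $t = n\|X\|_{\Psi_\alpha}$:
$$\frac{t^2/(n\|X\|_{\Psi_\alpha}^2)}{t^\alpha/(n^{\alpha-1}\|X\|_{\Psi_\alpha}^\alpha)} = \left(\frac{n\|X\|_{\Psi_\alpha}}{t}\right)^{\alpha-2}.$$
Hence for every $\alpha \neq 2$ the sub-Gaussian term $\frac{t^2}{n\|X\|_{\Psi_\alpha}^2}$ is the one selected (by the max when $\alpha \geq 2$, and by the min when $1 \leq \alpha \leq 2$) in the small-deviation regime $t \leq n\|X\|_{\Psi_\alpha}$, and the $\Psi_\alpha$ term takes over beyond it, matching the sub-Gaussian behavior around the origin described above.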

Paper Structure

This paper contains 32 sections, 25 theorems, 291 equations, 2 figures, and 2 tables.

Key Result

Corollary 1

Assume that $X_1, \cdots, X_n$ are i.i.d. mean-zero real-valued random variables satisfying $\|X\|_{\Psi_\alpha} < \infty$ for some $\alpha \geq 1$. Then we have that
$$\mathbb{P}\left(\left|\sum_{i=1}^n X_i\right| \geq t\right) \leq \begin{cases} 2\exp\left(-C\max\left\{\frac{t^2}{n\|X\|_{\Psi_\alpha}^2}, \frac{t^\alpha}{n^{\alpha-1}\|X\|_{\Psi_\alpha}^\alpha}\right\}\right), & \alpha \geq 2, \\[6pt] 2\exp\left(-C\min\left\{\frac{t^2}{n\|X\|_{\Psi_\alpha}^2}, \frac{t^\alpha}{n^{\alpha-1}\|X\|_{\Psi_\alpha}^\alpha}\right\}\right), & 1 \leq \alpha \leq 2, \end{cases}$$
where $C > 0$ is some constant that does not depend on $\alpha, t, n, X$.
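As a minimal numerical sketch of this bound (ours, not the paper's; the function name `orlicz_tail_bound` and the placeholder $C = 1$ are assumptions, since the corollary only guarantees some absolute constant $C > 0$):

```python
import numpy as np

def orlicz_tail_bound(t, n, K, alpha, C=1.0):
    """Evaluate the Corollary 1 bound on P(|X_1 + ... + X_n| >= t)
    for i.i.d. mean-zero X_i with Orlicz norm ||X||_{Psi_alpha} = K.
    C = 1.0 is a placeholder for the unspecified absolute constant."""
    sub_gaussian = t**2 / (n * K**2)                 # t^2 / (n ||X||^2)
    orlicz = t**alpha / (n**(alpha - 1) * K**alpha)  # t^a / (n^(a-1) ||X||^a)
    combine = max if alpha >= 2 else min             # phase transition at alpha = 2
    return 2.0 * np.exp(-C * combine(sub_gaussian, orlicz))

# The two terms cross at t = n * K; at alpha = 2 they are identical.
n, K = 1000, 1.0
for alpha in (1.5, 2.0, 3.0):
    print([orlicz_tail_bound(t, n, K, alpha) for t in (10.0, n * K, 1e4)])
```

Note that at $\alpha = 2$ the two terms inside the max/min are identical, so the two branches agree and the bound is continuous across the phase transition.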

Figures (2)

  • Figure 1: The $y$-axis represents the bound on $-\log(\frac{1}{2}{\mathbb P}(|\sum_{i=1}^n X_i|\geq t))$. Figures \ref{fig:kuchi_alpha<2} and \ref{fig:kuchi_alpha>2} are from kuchibhotla2022moving. Figure \ref{fig:alpha>2} corresponds to Theorem \ref{thm:conc_univariates} or Theorem \ref{thm:conc_univariate_SigmaL} when $\sigma_X$ does not need to be distinguished from $\|X\|_{\Psi_\alpha}$. The bound in Figure \ref{fig:alpha>2} improves that in Figure \ref{fig:kuchi_alpha>2}.
  • Figure 2: The $y$-axis represents the bound on $-\log(\frac{1}{2}{\mathbb P}(|\sum_{i=1}^n X_i|\geq t))$. Figure \ref{fig:koltchinskii} plots the bound given by koltchinskii2011neumann, whose tail probability is sharp when $t\leq \frac{n\sigma_X^2}{\|X\|_{\Psi_\alpha}\log^{\frac{1}{\alpha}}\left(\frac{\|X\|_{\Psi_\alpha}}{\sigma_X}\right)}$. Figures \ref{fig:sigmaL_alpha<2} and \ref{fig:sigmaL_alpha>2} correspond to Theorem \ref{thm:conc_univariate_SigmaL} and Corollary \ref{cor:norm_sigmaL}. The bounds in Figures \ref{fig:sigmaL_alpha<2} and \ref{fig:sigmaL_alpha>2} improve those in Figures \ref{fig:kuchi_alpha<2}, \ref{fig:kuchi_alpha>2}, and \ref{fig:koltchinskii}.
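A rough way to reproduce the $y$-axis quantity in these figures from the Corollary 1 bound (again with the placeholder $C = 1$; all names here are ours, and this is a sketch rather than the paper's plotting code):

```python
import numpy as np
import matplotlib.pyplot as plt

def log_exponent(t, n, K, alpha, C=1.0):
    """Exponent -log((1/2) * bound) for the Corollary 1 tail bound;
    C = 1.0 is a placeholder for the unspecified absolute constant."""
    combine = np.maximum if alpha >= 2 else np.minimum  # phase transition at alpha = 2
    return C * combine(t**2 / (n * K**2),
                       t**alpha / (n**(alpha - 1) * K**alpha))

n, K = 1000, 1.0
t = np.logspace(0, 5, 400)
for alpha in (1.5, 3.0):
    plt.loglog(t, log_exponent(t, n, K, alpha), label=f"alpha = {alpha}")
plt.axvline(n * K, linestyle="--")  # crossover at t = n * ||X||_{Psi_alpha}
plt.xlabel("t")
plt.ylabel(r"$-\log(\frac{1}{2}\,\mathrm{bound})$")
plt.legend()
plt.show()
```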

Theorems & Definitions (41)

  • Definition 1: Orlicz $\|\cdot\|_{\Psi_\alpha}$-norm
  • Corollary 1: Concentration for i.i.d. univariate
  • Lemma 1: Moment generating function
  • Lemma 2: Lower bound on moment generating function
  • Theorem 1: Concentration inequalities
  • Theorem 2: Moment inequalities
  • Corollary 2: Bound on Orlicz norm
  • Remark 1: Characterization of concentration
  • Definition 2
  • Remark 2
  • ...and 31 more