Concentration Inequalities for Suprema of Empirical Processes with Dependent Data via Generic Chaining with Applications to Statistical Learning

Chiara Amorino, Christian Brownlees, Ankita Ghosh

TL;DR

The results show that empirical risk minimization with dependent data attains a prediction accuracy comparable to that in the i.i.d. setting for a wide range of nonlinear regression models.

Abstract

This paper develops a general concentration inequality for the suprema of empirical processes with dependent data. The concentration inequality is obtained by combining generic chaining with a coupling-based strategy. Our framework accommodates high-dimensional and heavy-tailed (sub-Weibull) data. We demonstrate the usefulness of our result by deriving non-asymptotic predictive performance guarantees for empirical risk minimization in regression problems with dependent data. In particular, we establish an oracle inequality for a broad class of nonlinear regression models and, as a special case, a single-layer neural network model. Our results show that empirical risk minimization with dependent data attains a prediction accuracy comparable to that in the i.i.d. setting for a wide range of nonlinear regression models.

Paper Structure

This paper contains 16 sections, 12 theorems, 120 equations.

Key Result

Theorem 2.1

Suppose asm:increments_and_tail and asm:coupling are satisfied. Then, for any $n \in \{ 1 , \ldots , T \}$, any $\varepsilon_1\geq 2$ and any $\varepsilon_2>0$ holds at most with probability where $C_\alpha$ is a positive constant that depends on $\alpha$.

Theorems & Definitions (23)

  • Theorem 2.1: Concentration
  • Proposition 3.1
  • Corollary 3.1
  • Proposition 4.1
  • Proposition 4.2
  • Proposition 4.3: Generic Chaining
  • Proposition 4.4
  • Proof of Proposition \ref{corollary:erm:reg}
  • Proof of Corollary \ref{corollary:erm:neural_nets}
  • Proof of Proposition \ref{prop:coupling}
  • ...and 13 more