Strong Gaussian approximation for U-statistics in high dimensions and beyond

Weijia Li, Leheng Cai, Qirui Hu

Abstract

We establish a strong Gaussian approximation for high-dimensional non-degenerate U-statistics with diverging dimension. Under mild assumptions, we construct, on a sufficiently rich probability space, a Gaussian process that uniformly approximates the entire sequential U-statistic process. The approximation error is explicitly characterized and vanishes under polynomial growth of the dimension. The key technical contribution is a sharp martingale maximal inequality for completely degenerate U-statistics, combined with a high-dimensional strong approximation for independent sums. This coupling yields functional Gaussian limits without relying on $\mathcal{L}^\infty$-type bounds or bootstrap arguments. The theory is illustrated through three representative examples of U-statistics: the spatial Kendall's tau matrix, the multivariate Gini's mean difference, and the characteristic dispersion parameter. As applications, we derive Brownian bridge approximations for U-statistic-based change-point statistics and develop a self-normalized relevant testing procedure whose limiting distribution is fully pivotal. The framework naturally accommodates bounded kernels and therefore remains valid under heavy-tailed distributions. Overall, our results provide a unified probability-theoretic foundation for high-dimensional inference based on U-statistics.
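To make the "sequential U-statistic process" concrete, the following is a minimal sketch for an order-two kernel. It assumes the normalization $U_k = \binom{k}{2}^{-1}\sum_{i<j\le k} h(X_i, X_j)$, which may differ from the scaling used in the paper, and it assumes the Euclidean-distance kernel $h(x,y)=\|x-y\|$ as one common form of the multivariate Gini's mean difference. The names `sequential_u_statistics` and `gini_kernel` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sequential_u_statistics(X, kernel):
    """Return U_k for k = 2, ..., n, where
    U_k = (k choose 2)^{-1} * sum_{i<j<=k} kernel(X[i], X[j])  (assumed scaling)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    running = 0.0  # sum of kernel values over all pairs seen so far
    out = []
    for k in range(1, n):  # X[k] is the (k+1)-th observation
        # add the k new pairs (X[i], X[k]) with i < k
        running = running + sum(kernel(X[i], X[k]) for i in range(k))
        n_pairs = (k + 1) * k / 2  # number of pairs among the first k+1 points
        out.append(running / n_pairs)
    return np.array(out)

def gini_kernel(x, y):
    # assumed kernel: Euclidean distance between two observations
    return np.linalg.norm(x - y)

rng = np.random.default_rng(0)
X = rng.standard_t(df=3, size=(50, 4))  # illustrative heavy-tailed sample
U = sequential_u_statistics(X, gini_kernel)  # U[k-2] holds U_k
```

The incremental update costs $O(n^2)$ kernel evaluations in total, the same as computing the full-sample statistic once, so the whole path $k \mapsto U_k$ comes at no extra asymptotic cost.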

Paper Structure (9 sections, 10 theorems, 40 equations)

Key Result

Theorem 1

Under Assumptions (A1)-(A2), there exists a sequence of independent Gaussian random vectors $\{\bm Z_i\}_{i=1}^n$ with $\bm Z_i \sim \mathcal{N}(\bm{0}, \bm{\Sigma})$ such that the Gaussian partial sum process $\bm{W}_k = \sum_{i=1}^k \bm Z_i/\sqrt{n}$ uniformly approximates the sequential U-statistic process, with an explicitly characterized approximation error. Under (A3), this approximation error vanishes asymptotically.
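The Gaussian partial sum process in Theorem 1 can be simulated directly from its definition. The sketch below is only an illustration: the covariance `sigma` is a hypothetical positive-definite matrix, not the $\bm{\Sigma}$ determined by a particular kernel in the paper, and the function name `gaussian_partial_sums` is an assumption.

```python
import numpy as np

def gaussian_partial_sums(n, sigma, rng):
    """Simulate W_k = sum_{i<=k} Z_i / sqrt(n) for k = 1, ..., n,
    with Z_i i.i.d. N(0, sigma). Returns (W, Z), each of shape (n, d)."""
    d = sigma.shape[0]
    Z = rng.multivariate_normal(np.zeros(d), sigma, size=n)
    W = np.cumsum(Z, axis=0) / np.sqrt(n)  # row k-1 holds W_k
    return W, Z

rng = np.random.default_rng(1)
d, n = 3, 200
A = rng.normal(size=(d, d))
sigma = A @ A.T + np.eye(d)  # hypothetical positive-definite covariance
W, Z = gaussian_partial_sums(n, sigma, rng)
```

By construction $\bm{W}_k$ has covariance $(k/n)\bm{\Sigma}$, so the simulated path behaves like a $d$-dimensional Brownian motion with covariance $\bm{\Sigma}$ observed at the times $k/n$.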

Theorems & Definitions (17)

  • Example 1: Multivariate Gini's Mean Difference
  • Example 2: Characteristic dispersion parameter
  • Example 3: Spatial Kendall's tau matrix
  • Theorem 1
  • Proposition 1
  • Lemma 2.1: Maximal Inequality for Degenerate U-statistics
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Example 1: Multivariate Gini's Mean Difference (continued)
  • ...and 7 more
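As an illustration of one of the listed examples, here is a minimal sketch of the sample spatial Kendall's tau matrix. It assumes the standard kernel $h(x,y) = (x-y)(x-y)^\top/\|x-y\|^2$, which may differ in detail from the paper's definition; the function name `spatial_kendall_tau` is hypothetical.

```python
import numpy as np

def spatial_kendall_tau(X):
    """Average of the spatial-sign outer products over all pairs i < j
    (assumed kernel: (x - y)(x - y)^T / ||x - y||^2)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    i, j = np.triu_indices(n, k=1)  # index arrays for all pairs i < j
    diff = X[i] - X[j]              # shape (n_pairs, d)
    signs = diff / np.linalg.norm(diff, axis=1, keepdims=True)
    return signs.T @ signs / len(i)

rng = np.random.default_rng(2)
X = rng.standard_cauchy(size=(40, 3))  # heavy-tailed sample, no finite moments
tau = spatial_kendall_tau(X)
```

Every entry of this kernel lies in $[-1, 1]$, so the statistic is well defined even for the Cauchy sample used here, which is the sense in which bounded kernels accommodate heavy-tailed data. Each pairwise term has trace one, so the resulting matrix does as well.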