Adaptive monotonicity testing in sublinear time

Housen Li, Zhi Liu, Axel Munk

TL;DR

This work tackles monotonicity testing for nonparametric regression under Gaussian noise by introducing FOMT, a sparse, randomly sampled collection of local tests based on local polynomial estimation. FOMT achieves minimax separation rates over Hölder classes $\Sigma(\beta,L)$ for $\beta\in(0,2]$ while running in sublinear time in many scenarios. The method relies on refined variance bounds that exploit the correlation between nearby local estimators, and it uses spot-check-style sampling to keep the computation cheap. For unknown smoothness, the paper develops CALM, a bandwidth selection rule based on a modified Lepskii principle, and the resulting adaptive test A-FOMT, which retains optimal statistical performance at essentially the same computational cost as if the smoothness were known. Extensive simulations show that FOMT and A-FOMT match or outperform existing minimax procedures in speed while maintaining strong detection power, which supports their practical viability for large-scale data analysis.

Abstract

Modern large-scale data analysis increasingly faces the challenge of achieving computational efficiency as well as statistical accuracy, as classical statistically efficient methods often fall short in the first regard. In the context of testing monotonicity of a regression function, we propose FOMT (Fast and Optimal Monotonicity Test), a novel methodology tailored to meet these dual demands. FOMT employs a sparse collection of local tests, strategically generated at random, to detect violations of monotonicity scattered throughout the domain of the regression function. This sparsity enables significant computational efficiency, achieving sublinear runtime in most cases, and quasilinear runtime (i.e., linear up to a log factor) in the worst case. In contrast, existing statistically optimal tests typically require at least quadratic runtime. FOMT's statistical accuracy is achieved through the precise calibration of these local tests and their effective combination, ensuring both sensitivity to violations and control over false positives. More precisely, we show that FOMT separates the null and alternative hypotheses at minimax optimal rates over Hölder function classes of smoothness order in $(0,2]$. Further, when the smoothness is unknown, we introduce an adaptive version of FOMT, based on a modified Lepskii principle, which attains statistical optimality and meanwhile maintains the same computational complexity as if the intrinsic smoothness were known. Extensive simulations confirm the competitiveness and effectiveness of both FOMT and its adaptive variant.
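
To make the runtime claims concrete, the following minimal sketch evaluates the rates quoted in the caption of Figure 1: the bandwidth $h_n$, the error level $\gamma_n$, and the two complexity bounds between which FOMT interpolates. The function name `fomt_rates` and the dictionary keys are hypothetical, and all hidden constants are set to 1, which is an assumption. The numbers therefore indicate orders of growth only, not actual operation counts.

```python
import math


def fomt_rates(n: int, beta: float) -> dict:
    """Evaluate the Figure 1 rates with all constants set to 1 (an assumption)."""
    if n < 3:
        raise ValueError("n must be at least 3 so that log(n) > 1")
    if not 0 < beta <= 2:
        raise ValueError("beta must lie in (0, 2]")

    log_n = math.log(n)
    ratio = log_n / n
    denom = 2 * beta + 1

    # h_n ~ (log n / n)^{1/(2 beta + 1)}
    h_n = ratio ** (1 / denom)
    # gamma_n ~ (log n / n)^{(beta + 1 - ceil(beta)) / (2 beta + 1)}
    gamma_n = ratio ** ((beta + 1 - math.ceil(beta)) / denom)

    # Lower end of the range: n^{2 beta/(2 beta+1)} (log n)^{(4 beta+3)/(2 beta+1)}
    best_case_ops = n ** (2 * beta / denom) * log_n ** ((4 * beta + 3) / denom)
    # Upper end of the range: n (log n)^2
    worst_case_ops = n * log_n ** 2

    return {
        "h_n": h_n,
        "gamma_n": gamma_n,
        "best_case_ops": best_case_ops,
        "worst_case_ops": worst_case_ops,
        "quadratic_ops": float(n) ** 2,  # reference point for quadratic-time tests
    }


if __name__ == "__main__":
    for beta in (0.5, 1.0, 2.0):
        print(beta, fomt_rates(10**6, beta))
```

Dividing the two bounds gives $(\log n / n)^{1/(2\beta+1)}$, which is of the same order as $h_n$, so the lower bound is smaller than the upper bound by exactly a bandwidth-sized factor. Because of the logarithmic factors, the sublinear regime only becomes visible in absolute terms for large $n$.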

Paper Structure

This paper contains 24 sections, 31 theorems, 248 equations, 7 figures, 6 tables, 5 algorithms.

Key Result

Theorem 2.2

Under the nonparametric regression model \ref{model}, suppose that Assumptions M1 and K1--K3 hold. Let $H$ be defined in \ref{eq:null} and $\alpha\in (0,1)$. Then the FOMT $\Phi$ in \ref{alg:MC} is an $\alpha$-level test for $H$, that is, its type I error is at most $\alpha$.

Figures (7)

  • Figure 1: Phase diagram of FOMT in computational and statistical efficiency for the Hölder class $\Sigma(\beta, L)$ with $\beta \in (0,2]$. Here $h_n \asymp (\log(n)/n)^{1/(2\beta+1)}$ is the optimal bandwidth of the underlying local polynomial estimators, and $\gamma_n \asymp (\log(n)/n)^{(\beta + 1-\lceil\beta\rceil)/(2\beta+1)}$ is the minimax estimation error in $L^{\infty}$-norm. The shaded (green) region shows all possible relations between the $\gamma$-exceedance fraction $\varepsilon_{\lceil\beta\rceil-1,\gamma_n}(f)$ and the discrepancy from the null hypothesis (i.e., monotone increasing functions). The discrepancy is measured by $\max \{f(a) - f(b)\;|\; 0 \le a \le b\le 1\}$ if $\beta \in (0,1]$, and by $\max(\{-f'(x)\;|\;0\le x \le 1\} \cup \{0\})$ if $\beta \in (1,2]$. This phase diagram not only delineates the phase transition in statistical detectability of alternatives, but also highlights the computational complexity (indicated by varying degrees of darkness), which interpolates between $O\bigl(n^{\frac{2\beta}{2\beta+1}} \left(\log n\right)^{\frac{4\beta+3}{2\beta+1}}\bigr)$ and $O\bigl(n(\log n)^2\bigr)$. See Section \ref{SS: CA of Phi} for further details.
  • Figure 2: Illustration of FOMT (\ref{alg:MC}). The true signal $f(x)= x^2-x^7$ (blue dashed line) and its LPE $\hat{f}_n$ (red solid line) are displayed together with $n = 100$ observations $Y_i$ (gray circles) in both panels, with $x_I = 34/100$ in the first and $x_I= 75/100$ in the second. For each $x_I$, the locations $x_J$ visited in a single repetition of the left and right searches are highlighted by black crosses. In the first panel, no violation is detected with $I = 34$, while in the second panel, a violation with $I = 75$ and $J=93$ (marked with a red cross) is detected.
  • Figure 3: Squared biases versus variances of LPE over various choices of bandwidth, see \ref{ieq:BV trade-off}. We illustrate two scenarios: $f_1\in \Sigma(\beta,L)$ with $\beta>2$, and $f_2\in \mathcal{C}_{\mathcal{A},\beta}$ in \ref{e: defn class C} with $0<\beta\le2$. In the case of $f_1$, CALM (\ref{Lepskii}) returns $h = h_M$, while in the case of $f_2$, CALM returns $h_{\bar{m}}\lesssim \bigl(\log(n)/n\bigr)^{1/(2\beta+1)}$.
  • Figure 4: Test signals $f_1(x) = 1+x-0.45 \cdot \exp{(-50(x-0.5)^2)}$, $f_2(x) = -0.2\cdot \exp{(-50(x-0.5)^2)}$, $f_3(x) = -0.3 x$ and $f_4(x) = x(1-x)$. The function $f_1$ is a specific case in \cite{gijbels2000tests} with $a = 0.45$, while $f_2$ is considered by \cite{baraud2005testing}. The function $f_3$ decreases linearly with slope $-0.3$. The function $f_4$ is an "arch-like" function.
  • Figure 5: Detection powers of FOMT, A-FOMT, DS \cite{dumbgen2001multiscale}, ABD \cite{akakpo2014testing} and C \cite{chetverikov2019testing} under the alternatives $f_i$ with $i = 1, \dots, 4$, averaged over $100$ repetitions, for various sample sizes.
  • ...and 2 more figures
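
The signals named in the captions of Figures 2 and 4 can be written down directly. The sketch below defines them and measures how far each is from being monotone increasing, using the discrepancy $\max\{f(a)-f(b) \mid 0\le a\le b\le 1\}$ from the Figure 1 caption, approximated on a grid. The helper names `max_drop` and `simulate`, the equispaced design $x_i = i/n$, the noise level `sigma` and the grid size are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def f1(x: float) -> float:
    return 1 + x - 0.45 * math.exp(-50 * (x - 0.5) ** 2)


def f2(x: float) -> float:
    return -0.2 * math.exp(-50 * (x - 0.5) ** 2)


def f3(x: float) -> float:
    return -0.3 * x


def f4(x: float) -> float:
    return x * (1 - x)


def f_fig2(x: float) -> float:
    """Signal shown in Figure 2."""
    return x ** 2 - x ** 7


def max_drop(f, grid_size: int = 1001) -> float:
    """Grid approximation of max{f(a) - f(b) : 0 <= a <= b <= 1}.

    Returns 0.0 when f is nondecreasing on the grid.
    """
    running_max = -math.inf
    worst = 0.0
    for k in range(grid_size):
        y = f(k / (grid_size - 1))
        running_max = max(running_max, y)    # largest value seen so far (a <= b)
        worst = max(worst, running_max - y)  # drop from that value to f(b)
    return worst


def simulate(f, n: int, sigma: float, seed: int = 0):
    """Draw Y_i = f(x_i) + Gaussian noise on the assumed design x_i = i/n."""
    rng = random.Random(seed)
    xs = [i / n for i in range(1, n + 1)]
    ys = [f(x) + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


if __name__ == "__main__":
    for name, f in [("f1", f1), ("f2", f2), ("f3", f3), ("f4", f4), ("fig2", f_fig2)]:
        print(name, max_drop(f))
    xs, ys = simulate(f_fig2, n=100, sigma=0.1)
    print(xs[74], xs[92], f_fig2(xs[74]) > f_fig2(xs[92]))
```

All five signals have a strictly positive discrepancy, so each is an alternative to the null hypothesis of monotone increase. For the Figure 2 signal, the true value at $x = 75/100$ exceeds the value at $x = 93/100$, which is consistent with the violation reported there for $I = 75$ and $J = 93$.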

Theorems & Definitions (73)

  • Definition 1.1: $\gamma$-exceedance fraction
  • Definition 2.1
  • Theorem 2.2: Type I error
  • Theorem 2.3: Minimax optimality
  • Corollary 2.4
  • Remark 2.5
  • Theorem 3.1: Computation
  • Remark 3.2: Log-factor speed-up for $1 < \beta \le 2$
  • Theorem 4.1
  • Remark 4.2
  • ...and 63 more