
Second Maximum of a Gaussian Random Field and Exact (t-)Spacing test

Jean-Marc Azaïs, Federico Dalmao, Yohann De Castro

TL;DR

This work develops an exact, non-asymptotic inference framework for the mean of Gaussian random fields on Riemannian manifolds by exploiting the second maximum and an ad-hoc Kac–Rice formula. It introduces the spacing test (and a Studentized variant for unknown variance) based on the conditional distribution of the global maximum given the second maximum and the independent Hessian component, yielding uniform null p-values in finite samples. The approach is instantiated through a general continuous sparse kernel regression framework and linked to the LARS path, with concrete validation in Spiked Tensor PCA and related tensor models. The methodology provides exact calibration, clear interpretation via the spacing between top maxima, and broad applicability to Gaussian fields on manifolds, including practical variance estimation via Karhunen–Loève expansion.

Abstract

In this article, we introduce the novel concept of the second maximum of a Gaussian random field on a Riemannian submanifold. This second maximum serves as a powerful tool for characterizing the distribution of the maximum. By utilizing an ad-hoc Kac–Rice formula, we derive the explicit form of the maximum's distribution, conditioned on the second maximum and some regressed component of the Riemannian Hessian. This approach results in an exact test, based on the spacing between these maxima, which we refer to as the spacing test. We investigate the applicability of this test in detecting sparse alternatives within Gaussian symmetric tensors, continuous sparse deconvolution, and two-layered neural networks with smooth rectifiers. Our theoretical results are supported by numerical experiments, which illustrate the calibration and power of the proposed tests. More generally, this test can be applied to any Gaussian random field on a Riemannian manifold, and we provide a general framework for the application of the spacing test in continuous sparse kernel regression. Furthermore, when the variance-covariance function of the Gaussian random field is known up to a scaling factor, we derive an exact Studentized version of our test, coined the $t$-spacing test. This test is perfectly calibrated under the null hypothesis and has high power for detecting sparse alternatives.
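The exactness of spacing-type p-values can be illustrated in a toy discrete analogue (this sketch ignores the Riemannian Hessian correction that is central to the paper): for i.i.d. Gaussian values, conditionally on the second maximum $\lambda_2$, the maximum $\lambda_1$ follows the parent Gaussian truncated to $(\lambda_2,\infty)$, so the survival-function ratio $\bar\Phi(\lambda_1)/\bar\Phi(\lambda_2)$ is exactly Uniform(0,1) under the null by the probability integral transform. A minimal Monte Carlo check of this simplified statistic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sites, n_mc = 50, 20_000

# Null model: i.i.d. standard Gaussian values at n_sites locations.
X = rng.standard_normal((n_mc, n_sites))
top2 = np.sort(X, axis=1)[:, -2:]
lam2, lam1 = top2[:, 0], top2[:, 1]  # second maximum, maximum

# Spacing-type p-value: ratio of Gaussian survival functions.
p = stats.norm.sf(lam1) / stats.norm.sf(lam2)

# Under the null, p should be exactly Uniform(0,1); check with a KS test.
ks = stats.kstest(p, "uniform")
```

With the seed above, the Kolmogorov–Smirnov test does not reject uniformity, mirroring the perfect calibration shown in Figure 2 for the genuine (continuous, manifold-valued) version of the test.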


Paper Structure

This paper contains 34 sections, 11 theorems, 123 equations, 6 figures, and 2 tables.

Key Result

Lemma 1

Under Assumption $\bf (A_{1\text{-}4})$, one has

Figures (6)

  • Figure 1: [Spiked tensor PCA, Section \ref{sec:spike_tensor_model}, example 1/4] Visualisation of the random fields on the sphere $\mathds S^2$ for the Spiked Tensor PCA problem. Left: The Gaussian homogeneous polynomial $X(\cdot)$, where the arrow indicates the global maximizer $t_1$ (first eigenvector). Middle: The conditional random field $X^{|t_1}(\cdot)$ defined in \ref{e:Xbarra_intro}, where the arrow indicates the second maximizer $t_2$. Right: A volumetric view of $X^{|t_1}(\cdot)$, where the radial height represents the field value.
  • Figure 2: [Spiked tensor PCA, Section \ref{sec:spike_tensor_model}, example 2/4] The CDFs of the $p$-values of the ($t$-)spacing tests over $250{,}000$ Monte Carlo samples for each value of $\gamma$. Note that these tests are perfectly calibrated under the null hypothesis, for which $\gamma=0$. The parameter $\gamma$ is a scaling factor of the eigenvalue of the rank-one tensor to be detected. The value $\gamma=1$ corresponds to the so-called phase transition in Spiked Tensor PCA as presented in perry2020statistical. In the $t$-spacing test, the variance $\sigma$ has been estimated on $X^{|t_1}(\cdot)$.
  • Figure 3: [Spiked tensor PCA, Section \ref{sec:spike_tensor_model}, example 3/4] The PDFs of the maxima $\lambda_1,\lambda_2$ and the CDF of the distance $\mathrm{d}(t_0,t_1)$ over $250{,}000$ Monte Carlo samples for each value of the parameter $\gamma$. The alternative is given by $t_0$ fixed and $\lambda_0=\gamma \times\sigma \sqrt{3\log 3+3\log\log 3}\simeq 0.684 \gamma\sigma$, where $\gamma=1$ corresponds to the so-called phase transition in Spiked Tensor PCA as presented in perry2020statistical. The distance $\mathrm{d}(t_0,t_1)$ is normalized so that it is uniformly distributed on $(0,1)$ if $t_1$ is uniformly distributed on the sphere (e.g., $\gamma=0$).
  • Figure 4: [Spiked tensor PCA, Section \ref{sec:spike_tensor_model}, example 4/4] The variance estimator is not distributed according to the $\chi^2$-distribution with $\kappa=7$ degrees of freedom; it underestimates the variance. The dashed black line is the distribution of a $\chi(7)/\sqrt{7}$, and $\sigma=1$ in these experiments. The probability density function is estimated over $250{,}000$ Monte Carlo samples for each value of $\gamma$.
  • Figure 5: [Super-Resolution, Section \ref{sec:SR}] The super-resolution random field $Z(\cdot)$ depicts the observation of a point source at location $x_0$ on the interval $[0,2\pi)$ (spatial domain on the $x$-axis) with phase $\theta_0$ given on the $y$-axis. The $z$-axis corresponds to the value of the profile likelihood $Z(x,\theta)$ at point $x$ with phase $\theta$. This figure is presented in azais2020testing.
  • ...and 1 more figure

Theorems & Definitions (24)

  • Remark 1: Spiked tensor PCA, Section \ref{sec:spike_tensor_model}
  • Remark 2
  • Lemma 1
  • Proof
  • Lemma 2
  • Lemma 3
  • Remark 3
  • Lemma 4
  • Theorem 1
  • Theorem 2
  • ...and 14 more