Hypothesis Testing for Functional Linear Models via Bootstrapping

Yinan Lin, Zhenhua Lin

Abstract

Hypothesis testing for the slope function in functional linear regression is of both practical and theoretical interest. We develop a novel test for the nullity of the slope function, where testing the slope function is transformed into testing a high-dimensional vector based on functional principal component analysis. This transformation fully circumvents ill-posedness in functional linear regression, thereby enhancing numeric stability. The proposed method leverages the technique of bootstrapping max statistics and exploits the inherent variance decay property of functional data, improving the empirical power of tests especially when the sample size is limited or the signal is relatively weak. We establish validity and consistency of our proposed test when the functional principal components are derived from data. Moreover, we show that the test maintains its asymptotic validity and consistency, even when including all empirical functional principal components in our test statistics. This sharply contrasts with the task of estimating the slope function, which requires a delicate choice of the number (at most in the order of $\sqrt{n}$) of functional principal components to ensure estimation consistency. This distinction highlights an interesting difference between estimation and statistical inference regarding the slope function in functional linear regression. To the best of our knowledge, the proposed test is the first of its kind to utilize all empirical functional principal components.
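
To make the idea of bootstrapping a max statistic over empirical principal component scores concrete, here is a minimal Python sketch for a scalar response. It is an illustration under stated assumptions, not the authors' procedure: the function name `max_stat_bootstrap_test`, the equispaced grid on $[0,1]$, the unstandardized statistic and the Gaussian multiplier weights are all choices made here for illustration, and the sketch does not implement the variance decay adjustment mentioned in the abstract.

```python
import numpy as np

def max_stat_bootstrap_test(X, Y, n_boot=2000, seed=0):
    """Hypothetical helper. X: (n, m) curves on an equispaced grid over [0, 1]; Y: (n,) responses."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean()

    # Empirical FPCA through the SVD of the centered data matrix.
    # With quadrature weight 1/m, column j of `scores` approximates <X_i - Xbar, phi_j>.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s / np.sqrt(m)

    # Summands whose column means estimate E(<X, phi_j> Y) for every empirical component j.
    Z = scores * Yc[:, None]
    stat = np.abs(Z.sum(axis=0)).max() / np.sqrt(n)

    # Gaussian multiplier bootstrap of the max statistic (an assumed bootstrap scheme).
    Zc = Z - Z.mean(axis=0)
    E = rng.standard_normal((n_boot, n))
    boot = np.abs(E @ Zc).max(axis=1) / np.sqrt(n)

    # Bootstrap p-value with the usual +1 correction.
    p_value = (1 + np.sum(boot >= stat)) / (1 + n_boot)
    return stat, p_value
```

Note that the sketch keeps every component returned by the SVD rather than truncating, mirroring the abstract's point that the test can use all empirical functional principal components; no eigenvalue is ever inverted.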

Paper Structure

This paper contains 6 sections, 7 theorems, 29 equations, 5 figures, 1 table.

Key Result

Proposition 2.1

$\nu_{j_1 j_2} = \mathbb{E} ( \langle X, \phi_{j_1} \rangle_1 \langle Y, \psi_{j_2} \rangle_2 )$ and $b_{j_1j_2} = \lambda_{j_1}^{-1} \nu_{j_1j_2}$.
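
As a rough illustration of the two quantities above, the following Python sketch computes their sample analogues from principal component scores. The names `empirical_nu_and_b`, `xi` and `eta` are hypothetical, and the sketch assumes the scores are already centered and that the predictor scores are the projections $\langle X_i, \phi_{j_1} \rangle_1$ with second moments $\lambda_{j_1}$.

```python
import numpy as np

def empirical_nu_and_b(xi, eta):
    """Hypothetical helper.

    xi:  (n, p) centered predictor scores, xi[i, j1] ~ <X_i, phi_{j1}>
    eta: (n, q) centered response scores,  eta[i, j2] ~ <Y_i, psi_{j2}>
    """
    n = xi.shape[0]
    # Sample analogue of nu_{j1 j2} = E(<X, phi_{j1}> <Y, psi_{j2}>).
    nu_hat = xi.T @ eta / n
    # Sample second moment of each predictor score, standing in for lambda_{j1}.
    lam_hat = np.mean(xi ** 2, axis=0)
    # Sample analogue of b_{j1 j2} = nu_{j1 j2} / lambda_{j1}.
    b_hat = nu_hat / lam_hat[:, None]
    return nu_hat, lam_hat, b_hat
```

When every $\lambda_{j_1}$ is strictly positive, $b_{j_1 j_2} = 0$ exactly when $\nu_{j_1 j_2} = 0$, so a test can work with `nu_hat` directly and avoid dividing by small values of `lam_hat`, which would amplify noise.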

Figures (5)

  • Figure 1: Empirical size ($r=0$) and power ($r>0$) of the proposed method (red-solid), the exponential scan method (blue-dashed) and the Fisher-type method (black-dotted) for the scalar-on-function family.
  • Figure 2: Empirical size ($r=0$) and power ($r>0$) of the proposed method (red-solid) and the chi-squared test (blue-dashed) for the function-on-function family.
  • Figure 3: Empirical size ($r=0$) and power ($r>0$) of the proposed method (red-solid) and the F-test (blue-dashed) for the function-on-vector family.
  • Figure 4: Mean activity profile curves among male children (left) and female children (right) for different age groups, namely, age 6-8 (red-solid), age 9-10 (blue-dashed), age 11-12 (black-dotted), age 13-14 (purple-dash-dotted) and age 15-17 (green-dashed).
  • Figure 5: Mean activity profile curves among young female (top-left) and male (top-right) adults with their zoom-in regions (bottom) on the intensity spectrum $[300,1500]$ in different age groups, namely, age 18-21 (red-solid), age 22-25 (blue-dashed), age 26-29 (black-dotted), age 30-33 (purple-dash-dotted) and age 33-35 (green-dashed).

Theorems & Definitions (9)

  • Proposition 2.1
  • Proposition 3.4
  • Remark 1
  • Theorem 3.5: Uniform Gaussian approximation
  • Theorem 3.6: Uniform bootstrap approximation
  • Remark 2
  • Theorem 3.7
  • Theorem 3.8
  • Theorem 3.9