Partition-Based Functional Ridge Regression for High-Dimensional Data

Shaista Ashraf, Ismail Shah, Farrukh Javed

Abstract

This paper proposes a partition-based functional ridge regression framework to address multicollinearity, overfitting, and interpretability in high-dimensional functional linear models. The coefficient function vector \( \boldsymbol{\beta}(s) \) is decomposed into two components, \( \boldsymbol{\beta}_1(s) \) and \( \boldsymbol{\beta}_2(s) \), representing dominant and weaker functional effects. This partition enables differential ridge penalization across functional blocks, so that important signals are preserved while less informative components are shrunk more strongly. The resulting approach improves numerical stability and enhances interpretability without relying on explicit variable selection. We develop three estimators: the Functional Ridge Estimator (FRE), the Functional Ridge Full Model (FRFM), and the Functional Ridge Sub-Model (FRSM). Under standard regularity conditions, we establish consistency and asymptotic normality for all estimators. Simulation results reveal a clear bias-variance trade-off: FRSM performs best in small samples through strong variance reduction, whereas FRFM achieves superior accuracy in moderate to large samples by retaining informative functional structure through adaptive penalization. An empirical application to Canadian weather data further demonstrates improved predictive performance, reduced variance inflation, and clearer identification of influential functional effects. Overall, partition-based ridge regularization provides a practical and theoretically grounded method for high-dimensional functional regression.
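The idea of differential penalization across two coefficient blocks can be illustrated with a minimal sketch. It is not the paper's FRE, FRFM, or FRSM estimator, whose exact definitions are not given in this summary. It assumes the functional predictors have already been reduced to finite basis-score matrices `Z1` and `Z2` (hypothetical names), and it uses a plain identity penalty per block rather than a roughness penalty. The function name `block_ridge`, the simulated data, and the penalty values are all illustrative assumptions.

```python
import numpy as np

def block_ridge(Z1, Z2, y, lam1, lam2):
    """Ridge fit with a separate penalty for each of two coefficient blocks.

    Solves (Z'Z + diag(lam1 * I, lam2 * I)) b = Z'y with Z = [Z1, Z2].
    Assumption: Z1 and Z2 hold basis scores of the functional predictors.
    """
    Z = np.hstack([Z1, Z2])
    p1, p2 = Z1.shape[1], Z2.shape[1]
    # Block-diagonal penalty: lam1 on the first block, lam2 on the second.
    penalty = np.diag(np.concatenate([np.full(p1, lam1), np.full(p2, lam2)]))
    coef = np.linalg.solve(Z.T @ Z + penalty, Z.T @ y)
    return coef[:p1], coef[p1:]

# Simulated data (illustrative only): a dominant block and a weak block.
rng = np.random.default_rng(0)
n, p1, p2 = 200, 4, 6
Z1 = rng.normal(size=(n, p1))
Z2 = rng.normal(size=(n, p2))
b1_true = np.array([2.0, -1.5, 1.0, 0.5])
b2_true = np.full(p2, 0.05)
y = Z1 @ b1_true + Z2 @ b2_true + rng.normal(scale=0.5, size=n)

# Light shrinkage on the dominant block, heavy shrinkage on the weak block.
b1_hat, b2_hat = block_ridge(Z1, Z2, y, lam1=0.1, lam2=50.0)

# Setting both penalties to zero recovers ordinary least squares.
b1_ols, b2_ols = block_ridge(Z1, Z2, y, lam1=0.0, lam2=0.0)
```

Choosing a small `lam1` and a large `lam2` mirrors the description above: the block assumed to carry the dominant signal is left nearly unpenalized, while the weaker block is pulled toward zero without being dropped outright.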

Paper Structure

This paper contains 24 sections, 7 theorems, 101 equations, 5 figures, 7 tables.

Key Result

Theorem 1

Under (A1)–(A7), choose $K_z \sim n^{1/(4s+1)}$ and $\lambda_1 \sim n^{-2s/(4s+1)}$. Then

Figures (5)

  • Figure 1: Pairwise correlations among temperature predictors across stations prior to regularization. Correlations consistently exceed $0.97$, indicating severe multicollinearity that motivates ridge-type regularization.
  • Figure 2: Generalized cross-validation scores as a function of the smoothing parameter $\lambda$ for FRE, FRFM, and FRSM. FRFM selects a smaller $\lambda_1$ for the temperature block, reflecting weaker shrinkage.
  • Figure 3: True versus estimated coefficient functions for temperature (top row) and precipitation (bottom row) under FRE, FRFM, and FRSM. Black curves denote the true coefficient functions $\beta^{(T)}_1(t)$ and $\beta^{(P)}_2(t)$, while colored curves show the corresponding estimates $\hat{\beta}^{(T)}_1(t)$ (blue) and $\hat{\beta}^{(P)}_2(t)$ (orange).
  • Figure 4: Overall station-wise contribution to explaining Montreal weather, measured by the integrated magnitude $\int_{\mathcal{T}} \|\hat{\beta}_j(t)\|^2 dt$ for temperature (blue) and precipitation (orange). Larger values indicate stronger functional influence on Montreal’s annual mean temperature.
  • Figure 5: True versus estimated coefficient functions for Montreal under each estimator. FRFM achieves the closest alignment, particularly for temperature. Gray curves correspond to non-local stations, while the Montreal station is highlighted.
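The integrated magnitude used in Figure 4, $\int_{\mathcal{T}} \|\hat{\beta}_j(t)\|^2 dt$, can be approximated numerically once an estimated coefficient function is evaluated on a grid. The sketch below assumes a scalar-valued curve sampled on $[0, 1]$ and uses the trapezoidal rule. The function name `integrated_sq_magnitude` and the sine-shaped test curves are illustrative assumptions, not estimates from the paper.

```python
import numpy as np

def integrated_sq_magnitude(beta_vals, t):
    """Trapezoidal approximation of the integral of beta(t)^2 over the grid t."""
    sq = np.asarray(beta_vals, dtype=float) ** 2
    t = np.asarray(t, dtype=float)
    # Average adjacent squared values and weight by the grid spacing.
    return float(np.sum(0.5 * (sq[1:] + sq[:-1]) * np.diff(t)))

# Hypothetical coefficient curves on a daily grid rescaled to [0, 1].
t = np.linspace(0.0, 1.0, 365)
strong = integrated_sq_magnitude(np.sin(2 * np.pi * t), t)
weak = integrated_sq_magnitude(0.1 * np.sin(2 * np.pi * t), t)
```

Ranking stations by this quantity gives one summary number per coefficient curve, at the cost of discarding where in the year the influence occurs.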

Theorems & Definitions (14)

  • Theorem 1: Convergence rate of FRE
  • Theorem 2: Convergence rate of FRSM
  • Theorem 3: Convergence rates of FRFM
  • Theorem 4: Asymptotic normality
  • Remark 1
  • Lemma 1: Spline approximation
  • Lemma 2: Variance bound for functional ridge estimators
  • proof
  • proof
  • proof
  • ...and 4 more