Nonlinear functional regression by functional deep neural network with kernel embedding

Zhongjie Shi, Jun Fan, Linhao Song, Ding-Xuan Zhou, Johan A. K. Suykens

TL;DR

This work tackles nonlinear functional regression with infinite-dimensional inputs by proposing a discretization-invariant functional deep neural network (KEFNN) that couples kernel embedding with a data-driven projection and a deep ReLU predictor. The method first applies the kernel embedding $L_K f = \int_\Omega K(\cdot,t) f(t)\,dt$, then projects the result onto the first $d_1$ eigenfunctions of a projection kernel to obtain a finite-dimensional representation, which is fed into a deep network to predict the response. The authors establish explicit approximation rates for nonlinear functionals on Besov spaces, Gaussian RKHSs, and mixed smooth Sobolev spaces, showing that data-driven kernels can exploit input regularity to improve accuracy. They derive a novel two-stage oracle inequality to analyze ERM generalization in this discretized setting, which yields learning rates for Sobolev, Gaussian RKHS, and mixed smooth Sobolev input spaces, and they validate the approach with extensive simulations and real-data experiments. KEFNN demonstrates discretization invariance, robustness to noise, and competitive performance against baseline dimension-reduction strategies across diverse function spaces and datasets.

Abstract

Recently, deep learning has been widely applied in functional data analysis (FDA) with notable empirical success. However, the infinite dimensionality of functional data necessitates an effective dimension reduction approach for functional learning tasks, particularly in nonlinear functional regression. In this paper, we introduce a functional deep neural network with an adaptive and discretization-invariant dimension reduction method. Our functional network architecture consists of three parts: first, a kernel embedding step that features an integral transformation with an adaptive smooth kernel; next, a projection step that utilizes eigenfunction bases based on a projection Mercer kernel for the dimension reduction; and finally, a deep ReLU neural network is employed for the prediction. Explicit rates of approximating nonlinear smooth functionals across various input function spaces by our proposed functional network are derived. Additionally, we conduct a generalization analysis for the empirical risk minimization (ERM) algorithm applied to our functional net, by employing a novel two-stage oracle inequality and the established functional approximation results. Ultimately, we conduct numerical experiments on both simulated and real datasets to demonstrate the effectiveness and benefits of our functional net.
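
The three-part architecture described in the abstract (kernel embedding, projection onto eigenfunctions, deep ReLU network) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a one-dimensional domain $[0,1]$, Gaussian kernels for both the embedding and the projection kernel, trapezoidal quadrature for the integrals, a Nystrom-style eigendecomposition on a fixed grid, and random untrained network weights. All function names, bandwidths, and layer sizes are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_kernel(x, y, sigma):
    """Kernel matrix K[i, j] = exp(-(x_i - y_j)^2 / (2 sigma^2))."""
    return np.exp(-(x[:, None] - y[None, :]) ** 2 / (2.0 * sigma ** 2))

def trapezoid_weights(t):
    """Trapezoidal quadrature weights for a strictly increasing 1-D grid."""
    w = np.zeros_like(t)
    dt = np.diff(t)
    w[:-1] += dt / 2.0
    w[1:] += dt / 2.0
    return w

def projection_basis(s, d1, sigma):
    """Approximate the top d1 eigenfunctions of the projection kernel on grid s."""
    w = trapezoid_weights(s)
    sqrt_w = np.sqrt(w)
    # Symmetrized, quadrature-weighted kernel matrix W^{1/2} K W^{1/2}.
    a = sqrt_w[:, None] * gaussian_kernel(s, s, sigma) * sqrt_w[None, :]
    eigvals, eigvecs = np.linalg.eigh(a)
    top = np.argsort(eigvals)[::-1][:d1]
    # Eigenfunction values on s, orthonormal under the quadrature weights.
    phi = eigvecs[:, top] / sqrt_w[:, None]
    # Fix the sign of each eigenfunction so the output is deterministic.
    signs = np.sign(phi[np.argmax(np.abs(phi), axis=0), np.arange(d1)])
    return phi * signs, w

def embed_and_project(t_obs, f_obs, s, phi, w_s, sigma_embed):
    """Kernel embedding L_K f on grid s, then coefficients against phi."""
    w_t = trapezoid_weights(t_obs)
    # (L_K f)(s_j) ~ sum_k K(s_j, t_k) f(t_k) w_k
    lkf = gaussian_kernel(s, t_obs, sigma_embed) @ (w_t * f_obs)
    # <L_K f, phi_i> ~ sum_j w_j (L_K f)(s_j) phi_i(s_j)
    return phi.T @ (w_s * lkf)

def relu_mlp(x, weights, biases):
    """Forward pass of a fully connected ReLU network with a linear output layer."""
    h = x
    for w_mat, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(w_mat @ h + b, 0.0)
    return weights[-1] @ h + biases[-1]

# Fixed grid for the projection step (independent of how inputs are sampled).
s = np.linspace(0.0, 1.0, 200)
phi, w_s = projection_basis(s, d1=5, sigma=0.2)

# The same input function observed on two different discretization grids.
f = lambda t: np.sin(2.0 * np.pi * t) + 0.5 * t
t_coarse = np.linspace(0.0, 1.0, 100)
t_fine = np.linspace(0.0, 1.0, 400)
feat_coarse = embed_and_project(t_coarse, f(t_coarse), s, phi, w_s, sigma_embed=0.1)
feat_fine = embed_and_project(t_fine, f(t_fine), s, phi, w_s, sigma_embed=0.1)
print("max feature gap across grids:", np.max(np.abs(feat_coarse - feat_fine)))

# Untrained ReLU network with hypothetical layer sizes 5 -> 16 -> 16 -> 1.
rng = np.random.default_rng(0)
sizes = [5, 16, 16, 1]
weights = [rng.normal(size=(m, n)) / np.sqrt(n) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
y = relu_mlp(feat_coarse, weights, biases)
```

Because the features are quadrature approximations of integrals, they change only slightly when the same function is sampled on a different grid, which is the sense of discretization invariance this sketch is meant to convey. In practice the network weights would be fitted by empirical risk minimization on the projected features.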

Paper Structure

This paper contains 27 sections, 16 theorems, 142 equations, 3 figures, 4 tables, 1 algorithm.

Key Result

Theorem 1

Let $\alpha>0$, $d,M \in \mathbb N$. Assume that the input function space $\mathcal{F}$ is a compact subset of $L_\infty(\mathbb R^d) \cap B_{2,\infty}^\alpha(\mathbb R^d)$, with the condition that $\|f\|_{B_{2,\infty}^\alpha(\mathbb R^d)}\leq 1$ for any function $f\in \mathcal{F}$. Additionally, ... there exists a functional network $F_{NN}$ that follows the architecture specified in Definition ...

Figures (3)

  • Figure 1: Architecture of functional net with kernel embedding.
  • Figure 2: The influence of second-stage sample size $n$, depth of KEFNN, first-stage sample size $m$, variances of Gaussian noises in observations of input functions and responses on the test-set MSE.
  • Figure 3: The test-set MSE vs. number of second-stage samples (n) in logarithmic scale.

Theorems & Definitions (20)

  • Definition 1: Functional net with kernel embedding
  • Example 1
  • Example 2
  • Remark 1
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Proposition 1
  • Proposition 2
  • ...and 10 more