Generalized Sparse Additive Model with Unknown Link Function

Peipei Yuan, Xinge You, Hong Chen, Xuelin Zhang, Qinmu Peng

TL;DR

This work introduces GSAMUL, a generalized sparse additive model with an unknown link function that jointly estimates additive components via a B-spline basis and the link function via a multilayer perceptron, while enforcing sparsity with an $\ell_{2,1}$ penalty. A bilevel optimization scheme with training/validation splits enables simultaneous function estimation and variable selection, and a stability-based procedure identifies informative variables. The authors prove convergence to critical points under standard assumptions and demonstrate superior performance on synthetic and real data, particularly in detecting hidden interactions and excluding irrelevant features. Collectively, GSAMUL offers an interpretable, scalable approach for high-dimensional problems where the link function is not known a priori, improving both predictive accuracy and variable selection.

Abstract

Generalized additive models (GAMs) have been successfully applied to high-dimensional data analysis. However, most existing methods cannot simultaneously estimate the link function, the component functions, and the variable interactions. To alleviate this problem, we propose a new sparse additive model, named generalized sparse additive model with unknown link function (GSAMUL), in which the component functions are estimated by a B-spline basis and the unknown link function is estimated by a multi-layer perceptron (MLP) network. Furthermore, an $\ell_{2,1}$-norm regularizer is used for variable selection. The proposed GSAMUL can realize both variable selection and hidden interaction detection. We integrate this estimation into a bilevel optimization problem, where the data is split into a training set and a validation set. In theory, we provide convergence guarantees for the approximate procedure. In applications, experimental evaluations on both synthetic and real-world data sets consistently validate the effectiveness of the proposed approach.
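
To make the role of the $\ell_{2,1}$-norm regularizer concrete, the sketch below assumes that each component function is expanded as $f_j(x_j) = \sum_k \beta_{jk} B_k(x_j)$, so that variable $j$ owns one row $\beta_j$ of a coefficient matrix and the penalty is $\sum_j \|\beta_j\|_2$. The names `l21_norm` and `group_soft_threshold`, the matrix `W`, and the threshold `tau` are illustrative assumptions, not taken from the paper. The row-wise shrinkage step shown is the standard proximal operator of this penalty and is not necessarily the update the authors use.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (one row per input variable)."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def group_soft_threshold(W, tau):
    """Proximal operator of tau * l21_norm.

    Each row's norm is reduced by tau; rows whose norm is at most tau
    become exactly zero, which removes the corresponding variable.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already zero.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * W

# Hypothetical coefficients: 3 variables, 2 basis functions each.
W = np.array([[3.0, 4.0],   # strong variable (row norm 5)
              [0.3, 0.4],   # weak variable (row norm about 0.5)
              [0.0, 0.0]])  # already inactive

penalty = l21_norm(W)
W_shrunk = group_soft_threshold(W, tau=1.0)
selected = np.flatnonzero(np.linalg.norm(W_shrunk, axis=1) > 0)
print(penalty, W_shrunk, selected)
```

Because the penalty acts on whole rows rather than on individual coefficients, a variable is either kept with all of its basis coefficients shrunk or dropped entirely, which is what makes the regularizer usable for variable selection.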

Paper Structure

This paper contains 12 sections, 2 theorems, 42 equations, 3 figures, 6 tables, 1 algorithm.

Key Result

Theorem 1

Suppose the loss function $\ell$ is Lipschitz smooth with constant $L$ and has $\rho$-bounded gradients with respect to the training and validation data. Let the learning rates $\eta_t, \nu_t$, $1\leq t\leq T$, be monotonically decreasing sequences satisfying $\eta_t = \min\{\frac{1}{L},\frac{c}{\sqrt{T}}\}$, where $c$ is a constant independent of the iteration process.
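
As a small worked example of the step-size condition $\eta_t = \min\{\frac{1}{L},\frac{c}{\sqrt{T}}\}$, the sketch below evaluates it for two horizons. The function name `step_size` and the numeric values of `L`, `c`, and `T` are hypothetical and chosen only to show which term of the minimum is active. For a fixed horizon $T$ the expression takes the same value at every iteration $t$.

```python
import math

def step_size(L, c, T):
    """Return min(1/L, c/sqrt(T)); the value does not depend on the iteration index t."""
    if L <= 0 or c <= 0 or T <= 0:
        raise ValueError("L, c, and T must be positive")
    return min(1.0 / L, c / math.sqrt(T))

# Hypothetical values, chosen only to illustrate the two regimes.
short_run = step_size(L=10.0, c=1.0, T=25)    # 1/L = 0.1 is smaller than c/sqrt(T) = 0.2
long_run = step_size(L=10.0, c=1.0, T=400)    # c/sqrt(T) = 0.05 is smaller than 1/L = 0.1
print(short_run, long_run)
```

With few iterations the smoothness bound $\frac{1}{L}$ caps the step; once $T$ exceeds $(cL)^2$ the $\frac{c}{\sqrt{T}}$ term takes over and the step shrinks as the horizon grows.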

Figures (3)

  • Figure 1: Overview of the proposed GSAMUL.
  • Figure 2: The convergence curves of GSAMUL for Example A (left) and Example B (right).
  • Figure 3: The estimates of component curves and link function (red: GSAMUL estimator; blue: true functions) for Example A (top) and Example B (bottom).

Theorems & Definitions (5)

  • Remark 1
  • Remark 2
  • Remark 3
  • Theorem 1
  • Theorem 2