Generalized Sparse Additive Model with Unknown Link Function
Peipei Yuan, Xinge You, Hong Chen, Xuelin Zhang, Qinmu Peng
TL;DR
This work introduces GSAMUL, a generalized sparse additive model with an unknown link function that jointly estimates additive components via a B-spline basis and the link function via a multilayer perceptron, while enforcing sparsity with an $\ell_{2,1}$ penalty. A bilevel optimization scheme with training/validation splits enables simultaneous function estimation and variable selection, and a stability-based procedure identifies informative variables. The authors prove convergence to critical points under standard assumptions and demonstrate superior performance on synthetic and real data, particularly in detecting hidden interactions and excluding irrelevant features. Collectively, GSAMUL offers an interpretable, scalable approach for high-dimensional problems where the link function is not known a priori, improving both predictive accuracy and variable selection.
Abstract
Generalized additive models (GAMs) have been successfully applied to high-dimensional data analysis. However, most existing methods cannot simultaneously estimate the link function, the component functions, and variable interactions. To alleviate this problem, we propose a new sparse additive model, named generalized sparse additive model with unknown link function (GSAMUL), in which the component functions are estimated with a B-spline basis and the unknown link function is estimated by a multi-layer perceptron (MLP) network. Furthermore, an $\ell_{2,1}$-norm regularizer is used for variable selection. The proposed GSAMUL can thus perform variable selection and detect hidden interactions at the same time. We formulate this estimation as a bilevel optimization problem, in which the data are split into a training set and a validation set. In theory, we provide convergence guarantees for the approximate procedure. In applications, experimental evaluations on both synthetic and real-world data sets consistently validate the effectiveness of the proposed approach.
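
To make the role of the $\ell_{2,1}$-norm regularizer concrete, the following is a minimal sketch, not the authors' implementation. It assumes the B-spline coefficients are arranged in a matrix with one row per input variable, so that a row shrunk entirely to zero corresponds to a variable being dropped. The names `l21_norm`, `group_soft_threshold`, the matrix `B`, and the threshold `tau` are hypothetical. Group soft-thresholding is used here only as a standard illustration of how such a penalty induces row-wise sparsity; the bilevel optimization scheme of the paper is not reproduced.

```python
import numpy as np


def l21_norm(B):
    """Sum of row-wise Euclidean norms.

    Row j is assumed to hold the basis coefficients of variable j.
    """
    return float(np.sum(np.linalg.norm(B, axis=1)))


def group_soft_threshold(B, tau):
    """Proximal operator of tau * l21_norm.

    Each row's norm is reduced by tau; rows with norm <= tau become zero.
    """
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already zero.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * B


# Hypothetical coefficient matrix: 4 variables, 3 basis functions each.
B = np.array([
    [3.0, 4.0, 0.0],   # strong component
    [0.1, 0.0, 0.1],   # weak component, likely noise
    [0.0, 0.0, 0.0],   # already inactive
    [1.0, 2.0, 2.0],   # strong component
])

B_shrunk = group_soft_threshold(B, tau=0.5)

# Variables whose coefficient rows survive the shrinkage step.
selected = np.flatnonzero(np.linalg.norm(B_shrunk, axis=1) > 0)
print("l21 norm before:", l21_norm(B))
print("selected variables:", selected.tolist())
```

In this toy example the weak and inactive rows are set exactly to zero while the strong rows are only shrunk, which is the mechanism by which a row-wise penalty of this kind can exclude irrelevant features as a whole rather than coefficient by coefficient.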
