The Generalized Elastic Net for least squares regression with network-aligned signal and correlated design
Huy Tran, Sansen Wei, Claire Donnat
TL;DR
The paper introduces the Generalized Elastic Net (GEN), a regression framework for features indexed by the vertices of a given graph. GEN adds an $\ell_1+\ell_2$ penalty, built from the graph's incidence matrix, to the least squares loss, so that the estimator exploits smoothness or piecewise-constant structure of the signal with respect to the graph while remaining stable under correlated design. The paper provides non-asymptotic error bounds that depend on the graph, the spectrum of the design covariance $\Sigma$, and a decomposition of the signal into a component in the kernel of the penalty and an orthogonal component, and it shows that the $\ell_2$ term improves conditioning and tightens the prediction and estimation guarantees. A dual coordinate descent algorithm is developed for large-scale problems, with a runtime analysis and comparisons against interior-point (IP), ADMM, and ECOS solvers that show favorable scaling. Synthetic and real-data experiments (including COVID-19, Alzheimer's disease, and Chicago crime datasets) show that GEN performs better than existing regularized estimators when the signal aligns with the graph, and that it adapts to diverse graph structures and correlated designs. The work also discusses practical considerations and limitations, such as hyperparameter tuning and the sensitivity of the results to the network-alignment assumption.
Abstract
We propose a novel $\ell_1+\ell_2$-penalty, which we refer to as the Generalized Elastic Net, for regression problems where the feature vectors are indexed by vertices of a given graph and the true signal is believed to be smooth or piecewise constant with respect to this graph. Under the assumption of correlated Gaussian design, we derive upper bounds for the prediction and estimation errors, which are graph-dependent and consist of a parametric rate for the unpenalized portion of the regression vector and another term that depends on our network alignment assumption. We also provide a coordinate descent procedure based on the Lagrange dual objective to compute this estimator for large-scale problems. Finally, we compare our proposed estimator to existing regularized estimators on a number of real and synthetic datasets and discuss its potential limitations.
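To make the penalty concrete, the following is a minimal sketch of how a graph-incidence-based $\ell_1+\ell_2$ penalty could be evaluated. It is not the authors' implementation. The function names (`path_incidence`, `gen_penalty`), the choice of a path graph, and the exact scaling of the two terms (`lam1` times the $\ell_1$ norm plus `lam2` times the squared $\ell_2$ norm of the edge differences) are assumptions made for illustration; the paper may use a different normalization.

```python
import numpy as np


def path_incidence(p):
    """Edge-incidence matrix of a path graph on p vertices (hypothetical helper).

    Row e has +1 in column e and -1 in column e + 1, so (gamma @ beta)[e]
    is the difference of the signal across edge (e, e + 1).
    """
    gamma = np.zeros((p - 1, p))
    idx = np.arange(p - 1)
    gamma[idx, idx] = 1.0
    gamma[idx, idx + 1] = -1.0
    return gamma


def gen_penalty(beta, gamma, lam1, lam2):
    """Assumed form of the penalty: lam1 * ||gamma beta||_1 + lam2 * ||gamma beta||_2^2."""
    diffs = gamma @ beta
    return lam1 * np.abs(diffs).sum() + lam2 * (diffs @ diffs)


gamma = path_incidence(5)

# A piecewise-constant signal with a single jump of size 2 across one edge.
beta_piecewise = np.array([1.0, 1.0, 1.0, 3.0, 3.0])
penalty_piecewise = gen_penalty(beta_piecewise, gamma, lam1=1.0, lam2=0.5)  # 1*2 + 0.5*4 = 4.0

# A constant signal lies in the kernel of the incidence matrix and is not penalized.
beta_constant = np.full(5, 2.0)
penalty_constant = gen_penalty(beta_constant, gamma, lam1=1.0, lam2=0.5)  # 0.0
```

The constant signal illustrates the "unpenalized portion of the regression vector" mentioned above: on a connected graph, vectors that are constant across vertices lie in the kernel of the incidence matrix and incur no penalty, while each jump across an edge contributes to both the $\ell_1$ and $\ell_2$ terms.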
