Strong Screening Rules for Group-based SLOPE Models

Fabio Feser, Marina Evangelou

TL;DR

Tuning the regularization parameter in high-dimensional penalized regression is computationally expensive. This work addresses that cost by introducing strong screening rules for two group-based SLOPE models, gSLOPE and SGS, which belong to the broader OWL family. It presents a sparse-group screening framework with two layers (group screening followed by variable screening), uses KKT checks to guard against discarding active features, and extends the rules to OSCAR-like models. The theory covers subdifferential-based strong rules and their gradient-approximation variants, with proofs given in the appendices. Experiments on synthetic and real gene-expression data show substantial runtime reductions and improved convergence without loss of solution accuracy, making gSLOPE and SGS more scalable for $p \gg n$ settings in genetics and beyond.

Abstract

Tuning the regularization parameter in penalized regression models is an expensive task, requiring multiple models to be fit along a path of parameters. Strong screening rules drastically reduce computational costs by lowering the dimensionality of the input prior to fitting. We develop strong screening rules for group-based Sorted L-One Penalized Estimation (SLOPE) models: Group SLOPE and Sparse-group SLOPE. The developed rules are applicable to the wider family of group-based OWL models, including OSCAR. Our experiments on both synthetic and real data show that the screening rules significantly accelerate the fitting process. The screening rules make it feasible to apply group SLOPE and sparse-group SLOPE to high-dimensional datasets, particularly those encountered in genetics.
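
For readers unfamiliar with sorted penalties, the following is a minimal sketch of the sorted L-one norm and of a group version of it. The function names, the `groups` and `w` arguments, and the $\sqrt{p_g}$ scaling of each group norm are assumptions made for illustration; the page above does not state the exact penalty used in the paper.

```python
import numpy as np


def sorted_l1_norm(beta, lam):
    """Sorted L-one norm: pair the largest |beta| with the largest weight.

    `lam` is assumed to be a non-increasing, non-negative sequence with
    the same length as `beta`.
    """
    magnitudes = np.sort(np.abs(beta))[::-1]  # |beta| in decreasing order
    return float(np.dot(lam, magnitudes))


def group_slope_penalty(beta, groups, w):
    """Group version (illustrative): apply the sorted norm to group norms.

    `groups` is a list of index lists, one per group; `w` holds one
    non-increasing weight per group. Scaling each group's Euclidean norm
    by sqrt(group size) is an assumption, not taken from the source.
    """
    scaled_norms = np.array(
        [np.sqrt(len(g)) * np.linalg.norm(beta[g]) for g in groups]
    )
    return sorted_l1_norm(scaled_norms, w)


beta = np.array([3.0, -4.0, 0.0, 1.0])
print(sorted_l1_norm(beta, np.array([4.0, 3.0, 2.0, 1.0])))
print(group_slope_penalty(beta, [[0, 1], [2, 3]], np.array([2.0, 1.0])))
```

Because the weights are matched to coefficients by rank rather than by position, larger coefficients (or groups) are penalized more heavily, which is what makes screening for these models less direct than for the lasso.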

Paper Structure (55 sections, 10 theorems, 56 equations, 24 figures, 12 tables, 2 algorithms)

Key Result

Theorem 3.1

The subdifferential for gSLOPE is given by

Figures (24)

  • Figure 1: The proportion of variables in $\mathcal{S}_v$ relative to $p$ for group-only and bi-level screening applied to SGS, plotted along the regularization path with 95% confidence intervals. Synthetic data was generated under a linear model for $p = 500, 5000$ (Section \ref{section:results_sim}), with results averaged over 100 repetitions.
  • Figure 2: The proportion of groups/variables in $\mathcal{E}, \mathcal{A}$, relative to the input, for both gSLOPE and SGS as a function of the path for the linear model with $p=2750, \rho=0.6, m = 197$. The results are averaged over $100$ repetitions, with $95\%$ confidence intervals shown.
  • Figure 3: The proportion of groups/variables in $\mathcal{E}, \mathcal{A}$, relative to the full input, shown for gSLOPE and SGS. This is shown as a function of the correlation ($\rho$), averaged over all cases of the input dimension ($p$), with $100$ repetitions for each $p$, for both linear and logistic models, with standard errors shown.
  • Figure 4: Runtime (in seconds) for fitting $50$ models along a path, shown for screening against no screening as a function of $p$, broken down into different correlation cases, for the linear model. The results are averaged over $100$ repetitions, with standard errors shown.
  • Figure 5: The ratio of no screen time to screen time ($\uparrow$) of gSLOPE and SGS applied to the real datasets, for fitting $100$ path models, split into response type. The horizontal grey line represents no screening improvement.
  • ...and 19 more figures

Theorems & Definitions (20)

  • Theorem 3.1: gSLOPE subdifferential
  • Proposition 3.2: Strong screening rule for gSLOPE
  • Proposition 3.3: Gradient approximation strong screening rule for gSLOPE
  • Proposition 3.4: gSLOPE path start
  • Lemma 4.1
  • Proposition 4.2: Gradient approximation strong group screening rule for SGS
  • Proposition 4.3: Gradient approximation strong variable screening rule for SGS
  • Proposition 4.4: SGS path start
  • Proof of Theorem \ref{thm:gslope_subdiff}
  • Proof of Proposition \ref{propn:gslope_seq_strong}
  • ...and 10 more
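
The strong screening rules named in the list above are not spelled out on this page. As a rough illustration of the kind of test such rules rely on, the sketch below implements the cumulative-sum check known from strong rules for ordinary (ungrouped) SLOPE. It is not a reproduction of Propositions 3.2 to 4.3: the function name and the inputs `c` and `lam` are assumptions, and how the paper builds the grouped inputs is not shown here.

```python
import numpy as np


def cumsum_screen(c, lam):
    """Cumulative-sum screening test (illustrative, ungrouped form).

    `c` is assumed to hold non-negative scores sorted in decreasing
    order (for example, gradient-based bounds), and `lam` the matching
    non-increasing penalty weights. Returns the positions, in that
    sorted order, that survive screening.
    """
    kept = []
    block = []      # positions waiting for a verdict
    running = 0.0   # sum of (c - lam) over the current block
    for i in range(len(c)):
        block.append(i)
        running += c[i] - lam[i]
        if running >= 0:
            # The block as a whole could be active, so keep all of it.
            kept.extend(block)
            block = []
            running = 0.0
    return kept


c = np.array([3.0, 1.0, 0.5, 0.2])
lam = np.array([2.0, 1.5, 1.0, 0.5])
print(cumsum_screen(c, lam))
```

Positions left in the final unresolved block are the ones screened out. Since a strong rule is a heuristic, a fitting routine built on a test like this would normally follow it with KKT checks and refit if any discarded position violates them.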