The non-overlapping statistical approximation to overlapping group lasso

Mingyu Qi; Tianxi Li

TL;DR

The paper addresses the computational bottleneck of the overlapping group lasso by introducing a separable, partition-based penalty $\psi^{\mathcal{G}}(\beta)$ that upper-bounds the overlapping group lasso norm and is its tightest separable relaxation within the family of $\ell_{q_1}/\ell_{q_2}$ norms. Partitioning the variables into disjoint subgroups according to their overlaps, and reweighting those subgroups, reduces the optimization to a weighted non-overlapping group lasso. The authors prove that, under standard assumptions, the new estimator attains the same minimax error rate and support recovery properties as the original overlapping group lasso, while delivering substantial computational gains. Empirical results on simulations and a breast cancer pathway analysis show that the method matches the predictive performance of the overlapping group lasso almost exactly while computing much faster, which makes it practical for high-dimensional, structured regression problems. Overall, the work provides a principled, tight separable relaxation that broadens the applicability of group-structured regularization in genomics and related domains.
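The partition-and-reweight idea can be illustrated with a minimal sketch. It assumes two things this summary does not state: variables are placed in the same subgroup when they belong to exactly the same set of original groups, and each subgroup receives the sum of the weights of the original groups containing it. The function names `overlap_partition` and `group_penalty` are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def overlap_partition(groups, weights, p):
    """Partition variables 0..p-1 by their membership pattern.

    Assumed rule (not confirmed by the summary): two variables share a
    subgroup iff they lie in exactly the same original groups, and a
    subgroup's weight is the sum of the weights of those groups.
    A variable in no group would get weight 0 (left unpenalized).
    """
    by_pattern = defaultdict(list)
    for j in range(p):
        pattern = frozenset(k for k, g in enumerate(groups) if j in g)
        by_pattern[pattern].append(j)
    parts = list(by_pattern.values())
    part_weights = [sum(weights[k] for k in pattern) for pattern in by_pattern]
    return parts, part_weights


def group_penalty(beta, groups, weights):
    """Weighted sum of Euclidean norms of beta restricted to each group."""
    return sum(
        w * math.sqrt(sum(beta[j] ** 2 for j in g))
        for g, w in zip(groups, weights)
    )
```

For example, with original groups `[[0, 1, 2], [2, 3, 4]]` and unit weights, variable 2 is the only overlap, so the sketch yields the three subgroups `[0, 1]`, `[2]`, `[3, 4]`, with the overlapping subgroup carrying the combined weight of both original groups. Calling `group_penalty` with the original groups and then with the subgroups lets one compare the overlapping penalty against its separable surrogate on any coefficient vector.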

Abstract

Group lasso is a commonly used regularization method in statistical learning in which parameters are eliminated from the model according to predefined groups. However, when the groups overlap, optimizing the group lasso penalized objective can be time-consuming on large-scale problems because of the non-separability induced by the overlapping groups. This bottleneck has seriously limited the application of overlapping group lasso regularization in many modern problems, such as gene pathway selection and graphical model estimation. In this paper, we propose a separable penalty as an approximation of the overlapping group lasso penalty. Thanks to the separability, the computation of regularization based on our penalty is substantially faster than that of the overlapping group lasso, especially for large-scale and high-dimensional problems. We show that the penalty is the tightest separable relaxation of the overlapping group lasso norm within the family of $\ell_{q_1}/\ell_{q_2}$ norms. Moreover, we show that the estimator based on the proposed separable penalty is statistically equivalent to the one based on the overlapping group lasso penalty with respect to their error bounds and the rate-optimal performance under the squared loss. We demonstrate the faster computational time and statistical equivalence of our method compared with the overlapping group lasso in simulation examples and a classification problem of cancer tumors based on gene expression and multiple gene pathways.
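To see why separability helps computationally, recall that for disjoint groups the proximal operator of a weighted group lasso penalty has a closed form (block soft-thresholding), applied to each group independently. The sketch below shows that standard operator; it is not the authors' implementation, and the name `prox_group_lasso` and its parameters are illustrative assumptions.

```python
import math


def prox_group_lasso(v, parts, part_weights, step):
    """Proximal operator of step * sum_g w_g * ||v_g||_2 for disjoint parts.

    Each block is shrunk toward zero independently (block soft-thresholding);
    a block whose norm is at most step * w_g is set exactly to zero.
    Indices not covered by any part are returned unchanged.
    """
    out = list(v)
    for idx, w in zip(parts, part_weights):
        norm = math.sqrt(sum(v[j] ** 2 for j in idx))
        scale = max(0.0, 1.0 - step * w / norm) if norm > 0 else 0.0
        for j in idx:
            out[j] = scale * v[j]
    return out
```

Because every block is handled on its own, one proximal step costs time linear in the number of variables. With overlapping groups, no such per-group closed form applies to the full penalty, which is the non-separability the abstract refers to.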

Paper Structure (35 sections, 25 theorems, 244 equations, 2 figures, 3 tables, 2 algorithms)

Introduction
Methodology
Notation and Preliminaries
Overlapping Group Lasso
The Non-overlapping Approximation of the Overlapping Group Lasso
Step 1: overlapping-induced partition construction
Step 2: overlapping-based group weights calculation
Statistical Properties
Estimation Error Bounds
Lower Bound of Estimation Error
Support Recovery Consistency
Application Example: Pathway Analysis of Breast Cancer Data
Discussion
Notation summary
Uniqueness of the overlapping group lasso problem
...and 20 more sections

Key Result

Theorem 1

Let $\mathbb{G}$ represent the set of all possible partitions of $[p]$. Given the original groups $G$ and their weights $w$, there does not exist $0 \leqslant q_1,q_2 \leqslant \infty, \tilde{G} \in \mathbb{G}, \tilde{w} \in (0, \infty)^p$ such that:

Figures (2)

Figure 1: Illustration of proposed group partition in an interlocking group structure. Red regions are the overlapping variables in the original group structure.
Figure 2: Illustration of two norms in $\mathbb{R}^3$: the outer region depicts the unit ball of the overlapping group lasso norm defined by $\{ \beta: \phi^G(\beta) \leqslant 1 \}$; the inner region represents the unit ball of our proposed separable norm $\{ \beta: \psi^{\mathcal{G}}(\beta) \leqslant 1 \}$.
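The nesting of the two unit balls in Figure 2 follows from a short triangle-inequality argument. The derivation below is a sketch under assumptions not stated in this summary: the original groups are $G_1, \dots, G_m$ with weights $w_1, \dots, w_m$, every $G_k$ is a disjoint union of subgroups $g \in \mathcal{G}$, and each subgroup carries the weight $\tilde{w}_g = \sum_{k:\, g \subseteq G_k} w_k$. The symbols $m$, $w_k$, and $\tilde{w}_g$ are introduced here for illustration only.

$$
\phi^G(\beta)
= \sum_{k=1}^{m} w_k \,\|\beta_{G_k}\|_2
\;\leqslant\; \sum_{k=1}^{m} w_k \sum_{g \in \mathcal{G}:\, g \subseteq G_k} \|\beta_g\|_2
= \sum_{g \in \mathcal{G}} \Big( \sum_{k:\, g \subseteq G_k} w_k \Big) \|\beta_g\|_2
= \psi^{\mathcal{G}}(\beta).
$$

Under these assumptions $\psi^{\mathcal{G}}(\beta) \leqslant 1$ implies $\phi^G(\beta) \leqslant 1$, so $\{ \beta: \psi^{\mathcal{G}}(\beta) \leqslant 1 \} \subseteq \{ \beta: \phi^G(\beta) \leqslant 1 \}$, which matches the inner and outer regions described in the caption.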

Theorems & Definitions (25)

Theorem 1
Theorem 2
Proposition 3
Theorem 4
Corollary 1
Theorem 5
Lemma 1
Proposition 1
Theorem 6
Theorem 7
...and 15 more
