ADMM-based Bilevel Descent Aggregation Algorithm for Sparse Hyperparameter Selection

Yunhai Xiao; Anqi Liu; Peili Li; Yanyun Ding

TL;DR

This paper focuses on a particular type of nonsmooth convex sparse optimization problem and presents a new bilevel optimization framework that effectively integrates the alternating direction method of multipliers with a bilevel descent aggregation (BDA) algorithm.

Abstract

It is widely acknowledged that hyperparameter selection plays a critical role in the effectiveness of sparse optimization. Bilevel optimization provides a robust framework for addressing this issue, but existing methods depend heavily on the lower-level singleton (LLS) assumption, which greatly limits their practical applicability. To tackle this technical challenge, this paper focuses on a particular type of nonsmooth convex sparse optimization problem and presents a new bilevel optimization framework. This framework integrates the alternating direction method of multipliers (ADMM) with a bilevel descent aggregation (BDA) algorithm. Specifically, it employs ADMM to efficiently solve the lower-level problem and uses BDA to explore the hyperparameter space, thereby coupling the upper- and lower-level problems. A key contribution of this paper is a novel convergence analysis, which shows that the proposed ADMM-BDA algorithm achieves global convergence under significantly relaxed conditions, thereby departing from the LLS assumption that is often required in the literature. We conduct a series of numerical experiments on synthetic and real-world data and compare performance against several state-of-the-art algorithms. The results indicate that ADMM-BDA exhibits superior effectiveness and robustness for solving bilevel programming problems, especially when the lower-level problem is an elastic-net penalized statistical problem.
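To make the lower-level step concrete, the following is a minimal sketch of a textbook ADMM iteration for an elastic-net penalized least-squares problem, $\min_x \tfrac12\|Ax-b\|^2 + \lambda_1\|x\|_1 + \tfrac{\lambda_2}{2}\|x\|^2$, with the hyperparameters held fixed. It is an illustration under stated assumptions, not the paper's algorithm: the objective form, the splitting $x = z$, the penalty parameter `rho`, and the names `soft_threshold` and `admm_elastic_net` are all assumptions introduced here, and the paper's own $y$-, $z$-, $x$-subproblems may use a different splitting. The BDA outer loop over the hyperparameters is not shown.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def admm_elastic_net(A, b, lam1, lam2, rho=1.0, iters=500):
    """Scaled-form ADMM for 0.5*||Ax-b||^2 + lam1*||x||_1 + 0.5*lam2*||x||^2.

    Assumed splitting (not taken from the paper): minimize over x and z
    subject to x = z, with the l1 term placed on z.
    """
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable

    # The x-update solves the same linear system every iteration,
    # so factor the (positive definite) matrix once.
    L = np.linalg.cholesky(A.T @ A + (lam2 + rho) * np.eye(n))
    Atb = A.T @ b

    for _ in range(iters):
        # x-update: ridge-type linear solve via the Cholesky factor.
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        # z-update: soft-thresholding enforces sparsity.
        z = soft_threshold(x + u, lam1 / rho)
        # Dual update on the constraint residual x - z.
        u = u + x - z
    return z
```

With an identity design matrix the minimizer has the closed form `soft_threshold(b, lam1) / (1 + lam2)`, which gives a quick sanity check for the sketch. A bilevel method such as the one described above would call a solver of this kind (or run a fixed number of its iterations) inside an outer loop that adjusts `lam1` and `lam2`.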

Paper Structure (10 sections, 8 theorems, 96 equations, 2 figures, 4 tables)

Introduction
Preliminaries
ADMM-BDA for solving \ref{model0}
Convergence analysis
Experimental Evaluation of Algorithms on Synthetic and Real-World Data
Experiments with Synthetic Data
Testing for Elastic-Net Penalized Lower-Level Problem
Testing for Generalized-Elastic-Net Penalized Lower-Level Problem
Experiments with Real-World Data
Conclusion

Key Result

Lemma 4.1

Suppose Assumption ass2 holds. Given $\lambda^k\in \Lambda$, define $\omega_l^{j}:=(y_l^{(j)},z_l^{(j)},x_l^{(j)})$ and $\omega^{j}:=(y_l^{(j)},z_l^{(j)},x^{(j)})$. Then, for all $j\geq 0$, the $y$-, $z$-, and $x$-subproblems associated with suby0-subx0 give:

Figures (2)

Figure 1: Numerical performance of algorithms on elastic-net penalized lower-level problem.
Figure 2: Numerical performance of algorithms on generalized-elastic-net penalized lower-level problem across various noise types.

Theorems & Definitions (16)

Lemma 4.1
proof
Lemma 4.2
proof
Lemma 4.3
proof
Lemma 4.4
proof
Proposition 1
proof
...and 6 more
