ADMM-based Bilevel Descent Aggregation Algorithm for Sparse Hyperparameter Selection

Yunhai Xiao, Anqi Liu, Peili Li, Yanyun Ding

TL;DR

This paper focuses on a particular type of nonsmooth convex sparse optimization problem and presents a new bilevel optimization framework that effectively integrates the alternating direction method of multipliers with a bilevel descent aggregation (BDA) algorithm.

Abstract

It is widely acknowledged that hyperparameter selection plays a critical role in the effectiveness of sparse optimization problems. Bilevel optimization provides a robust framework for addressing this issue, but existing methods depend heavily on the lower-level singleton (LLS) assumption, which greatly limits their practical applicability. To tackle this technical challenge, this paper focuses on a particular type of nonsmooth convex sparse optimization problem and presents a new bilevel optimization framework. This framework integrates the alternating direction method of multipliers (ADMM) with a bilevel descent aggregation (BDA) algorithm. Specifically, it employs ADMM to solve the lower-level problem and uses BDA to explore the hyperparameter space, thereby coupling the upper- and lower-level problems. A key contribution of this paper is a novel convergence analysis, which shows that the proposed ADMM-BDA algorithm achieves global convergence under significantly relaxed conditions, thereby departing from the LLS assumption that is often required in the literature. We conduct a series of numerical experiments on synthetic and real-world data and compare performance against several state-of-the-art algorithms. The results indicate that ADMM-BDA exhibits superior effectiveness and robustness for solving bilevel programming problems, especially when the lower-level problem is an elastic-net penalized statistical problem.
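To make the lower-level step concrete, the following is a minimal sketch of a textbook scaled-form ADMM iteration for an elastic-net penalized least-squares problem with fixed hyperparameters. It is not the paper's ADMM-BDA algorithm: the splitting `x = z`, the function name `admm_elastic_net`, the penalty parameter `rho`, and the fixed iteration count are all assumptions chosen for illustration, and the upper-level (BDA) update of the hyperparameters is not shown.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def admm_elastic_net(A, b, lam1, lam2, rho=1.0, n_iter=500):
    """Illustrative ADMM for
        min_x 0.5*||A x - b||^2 + lam1*||x||_1 + 0.5*lam2*||x||^2
    using the (assumed) splitting x = z with scaled dual variable u.
    """
    n = A.shape[1]
    z = np.zeros(n)
    u = np.zeros(n)
    # The x-update is a ridge-type linear system with a fixed matrix.
    M = A.T @ A + (lam2 + rho) * np.eye(n)
    Atb = A.T @ b
    for _ in range(n_iter):
        x = np.linalg.solve(M, Atb + rho * (z - u))  # smooth part
        z = soft_threshold(x + u, lam1 / rho)        # l1 part
        u = u + x - z                                # scaled dual update
    return z  # z carries the exact zeros produced by shrinkage


if __name__ == "__main__":
    # With A = I the minimizer has the closed form
    # soft_threshold(b, lam1) / (1 + lam2), which gives a quick sanity check.
    b = np.array([3.0, -0.5, 2.0])
    print(admm_elastic_net(np.eye(3), b, lam1=1.0, lam2=1.0))
```

In a bilevel setting, `lam1` and `lam2` would be the hyperparameters chosen by the upper-level problem; here they are simply passed in as constants.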

Paper Structure

This paper contains 10 sections, 8 theorems, 96 equations, 2 figures, and 4 tables.

Key Result

Lemma 4.1

Suppose Assumption ass2 holds. Given $\lambda^k\in \Lambda$, define $\omega_l^{j}:=({y}_l^{(j)},{z}_l^{(j)},x_l^{(j)})$ and $\omega^{j}=({y}_l^{(j)},{z}_l^{(j)},x^{(j)})$. For all $j\geq 0$, and the $y$-, $z$-, and $x$-subproblems associated with suby0-subx0, we get that:

Figures (2)

  • Figure 1: Numerical performance of algorithms on elastic-net penalized lower-level problem.
  • Figure 2: Numerical performance of algorithms on generalized-elastic-net penalized lower-level problem across various noise types.

Theorems & Definitions (16)

  • Lemma 4.1
  • proof
  • Lemma 4.2
  • proof
  • Lemma 4.3
  • proof
  • Lemma 4.4
  • proof
  • Proposition 1
  • proof
  • ...and 6 more