Set Smoothness Unlocks Clarke Hyper-stationarity in Bilevel Optimization

He Chen, Jiajin Li, Anthony Man-Cho So

TL;DR

This paper shows that strong (Clarke) hyper-stationarity remains computable even when the hyper-objective is nonsmooth. It also proves that a new structural property called set smoothness, which captures the variational dependence of the lower-level solution set on the upper-level variable, holds for a broad class of BLO problems.

Abstract

Solving bilevel optimization (BLO) problems to global optimality is generally intractable. A common surrogate is to compute a hyper-stationary point -- a stationary point of the hyper-objective function obtained by minimizing or maximizing the upper-level objective over the lower-level solution set. Existing methods, however, either provide weak notions of stationarity or require restrictive assumptions to guarantee the smoothness of hyper-objective functions. In this paper, we eliminate these impractical assumptions and show that strong (Clarke) hyper-stationarity remains computable even when the hyper-objective is nonsmooth. Our key ingredient is a new structural property, called set smoothness, which captures the variational dependence of the lower-level solution set on the upper-level variable. We prove that this property holds for a broad class of BLO problems and ensures weak convexity (resp. concavity) of pessimistic (resp. optimistic) hyper-objective functions. Building on this foundation, we show that a zeroth-order algorithm computes approximate Clarke hyper-stationary points with non-asymptotic convergence guarantees. To the best of our knowledge, this is the first computational guarantee for Clarke-type stationarity in nonsmooth BLO. Beyond this specific application, the set smoothness property emerges as a structural concept of independent interest, with potential to inform the analysis of broader classes of optimization and variational problems.
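
For orientation, the objects named in the abstract can be written out as follows. This is a sketch in assumed notation, not necessarily the paper's: $F$ (upper-level objective), $f$ (lower-level objective), $\varphi_{\rm o}$, $\varphi_{\rm p}$ and the modulus $\rho$ are labels chosen here for illustration, and any lower-level constraint set is omitted.

```latex
% Assumed notation: F = upper-level objective, f = lower-level objective.
% Lower-level solution set as a function of the upper-level variable x:
\[
  {\cal S}({\bm x}) \;=\; \operatorname*{argmin}_{{\bm y}} \, f({\bm x},{\bm y}).
\]
% Optimistic and pessimistic hyper-objectives
% (minimize, resp. maximize, F over the lower-level solution set):
\[
  \varphi_{\rm o}({\bm x}) \;=\; \min_{{\bm y}\in{\cal S}({\bm x})} F({\bm x},{\bm y}),
  \qquad
  \varphi_{\rm p}({\bm x}) \;=\; \max_{{\bm y}\in{\cal S}({\bm x})} F({\bm x},{\bm y}).
\]
% Standard meaning of weak convexity with an assumed modulus rho >= 0:
\[
  \varphi_{\rm p} \text{ is } \rho\text{-weakly convex}
  \;\iff\;
  \varphi_{\rm p} + \tfrac{\rho}{2}\|\cdot\|^2 \text{ is convex},
\]
% and phi_o is rho-weakly concave when -phi_o is rho-weakly convex.
```

In this notation, a Clarke hyper-stationary point is a point ${\bm x}$ whose Clarke subdifferential of the chosen hyper-objective contains the origin, which is the standard meaning of Clarke stationarity.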

Paper Structure

This paper contains 21 sections, 12 theorems, 117 equations, 2 algorithms.

Key Result

lemma 1

(cf. chen2023bilevel) Under Assumption assum:basic, the lower-level solution set function is $M_{{\cal S}}$-Lipschitz continuous with $M_{{\cal S}}= L_f\tau$, i.e., for any ${\bm x}_1,{\bm x}_2\in{\mathbb{R}}^m$,

Theorems & Definitions (36)

  • lemma 1: Lipschitz Continuity of ${\cal S}({\bm x})$
  • lemma 2
  • definition 1: Clarke Subdifferential
  • remark 1
  • definition 2: Goldstein $\delta$-Subdifferential
  • lemma 3: Equivalent Characterizations of Weak Convexity
  • lemma 4: Properties of Moreau Envelope
  • definition 3: Set Smoothness
  • example 1: Why the condition \ref{eq:point_set2} is needed: A trivialization for the condition \ref{eq:point_set}
  • theorem 1: Implication of Set Smoothness
  • ...and 26 more