A Correspondence-Driven Approach for Bilevel Decision-making with Nonconvex Lower-Level Problems
Xiaotian Jiang, Jiaxiang Li, Mingyi Hong, Shuzhong Zhang
TL;DR
This work tackles bilevel optimization with nonconvex lower-level objectives by replacing the traditional rational-follower assumption with a correspondence-driven hyperfunction $φ^{\text{cd}}$ that models an algorithmic, boundedly rational follower. To handle discontinuities, it introduces a Gaussian-smoothed approximation $φ^{\text{cd}}_ξ$ and proves that the smoothed value and gradient converge to those of the unsmoothed $φ^{\text{cd}}$ at appropriate points. It then develops SCiNBiO, a biased SGD-based method with a cubic-regularized Newton lower-level solver, and provides convergence and oracle-complexity guarantees, including a refinement of the lower-level complexity based on fold bifurcations. The framework takes a prevalence-based perspective on regularity to address bifurcation phenomena, linking fold bifurcations from dynamical systems to the geometry of the lower-level landscape. Experiments demonstrate SCiNBiO’s robustness to nonconvexity and its superiority over competing BLO methods on both minimax and hyperparameter-optimization tasks.
Abstract
We consider bilevel optimization problems with general nonconvex lower-level objectives and show that the classical hyperfunction-based formulation is unsettled, since the global minimizer of the lower-level problem is generally unattainable. To address this issue, we propose a correspondence-driven hyperfunction $φ^{\text{cd}}$. In this formulation, the follower is modeled not as a rational agent that always attains a global minimizer, but as an algorithm-based, boundedly rational agent whose decisions are produced by a fixed algorithm with a prescribed initialization and step size. Since $φ^{\text{cd}}$ is generally discontinuous, we apply Gaussian smoothing to obtain a smooth approximation $φ^{\text{cd}}_ξ$, and we show that its value and gradient converge to those of $φ^{\text{cd}}$. In the nonconvex setting, we identify bifurcation phenomena, which arise when $g(x,\cdot)$ has a degenerate stationary point, as a key challenge for hyperfunction-based methods, especially when $φ^{\text{cd}}_ξ$ is minimized using gradient methods. To overcome this challenge, we analyze the geometric structure of the bifurcation set under some weak assumptions. Building on these results, we design SCiNBiO, a biased projected SGD-based algorithm that minimizes $φ^{\text{cd}}_ξ$ using a cubic-regularized Newton method as the lower-level solver. We also provide convergence guarantees and oracle complexity bounds for the upper level. Finally, we connect bifurcation theory from dynamical systems to the bilevel setting and define the notion of fold bifurcation points in this setting. Under the assumption that all degenerate stationary points are fold bifurcation points, we establish the oracle complexity of SCiNBiO for the lower-level problem.
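To make the construction concrete, the following is a minimal sketch and not the authors' implementation. It assumes a one-dimensional toy problem with lower-level objective $g(x,y) = y^4/4 - y^2/2 - xy$ and upper-level objective $f(x,y) = (y-1)^2 + 0.1x^2$, both chosen only for illustration. The follower's fixed algorithm is taken to be plain gradient descent (swapped in for the cubic-regularized Newton solver used by SCiNBiO), and the smoothing uses the standard Gaussian-smoothing form $\mathbb{E}_{u \sim N(0,1)}[φ^{\text{cd}}(x + ξu)]$ with its usual forward-difference gradient estimator, which may differ in detail from the paper's definition. All function names and constants are hypothetical.

```python
import numpy as np

# Assumed follower configuration: fixed initialization, step size, iteration budget.
Y0, STEP, ITERS = 1.0, 0.05, 1000


def follower_response(x):
    """Output of fixed-step gradient descent on the toy g(x, y) = y^4/4 - y^2/2 - x*y."""
    x = np.asarray(x, dtype=float)
    y = np.full_like(x, Y0)  # every run starts from the same point Y0
    for _ in range(ITERS):
        y = y - STEP * (y**3 - y - x)  # y**3 - y - x is dg/dy
    return y


def phi_cd(x):
    """Toy correspondence-driven hyperfunction: f evaluated at the follower's output."""
    x = np.asarray(x, dtype=float)
    y = follower_response(x)
    return (y - 1.0) ** 2 + 0.1 * x**2  # assumed upper-level objective f(x, y)


def phi_cd_smoothed(x, xi=0.1, n_samples=2000, seed=0):
    """Monte Carlo estimate of the Gaussian-smoothed value and its gradient at x."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(n_samples)
    vals = phi_cd(x + xi * u)  # hyperfunction at randomly perturbed upper-level points
    value = vals.mean()
    # Forward-difference Gaussian-smoothing gradient estimator.
    grad = np.mean((vals - phi_cd(x)) * u) / xi
    return float(value), float(grad)
```

In this toy problem, the local minimizer that the follower reaches from `Y0` disappears at $x = -2/(3\sqrt{3}) \approx -0.385$, so `phi_cd` jumps there (compare `phi_cd(-0.3)` with `phi_cd(-0.5)`), while `phi_cd_smoothed` averages across the jump and remains differentiable in `x`.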
