Accelerating Diagonal Methods for Bilevel Optimization: Unified Convergence via Continuous-Time Dynamics
Radu Ioan Boţ, Enis Chenchene, Ernö Robert Csetnek, David Alexander Hulett
TL;DR
This work addresses the efficient solution of simple bilevel programs, in which the lower-level problem does not depend on the upper-level variable, by developing diagonal (Tikhonov-regularized) methods obtained by discretizing continuous-time dynamics. Two algorithms are analyzed: a proximal-gradient method, derived from first-order dynamics, and a fast proximal-gradient method with Nesterov momentum, derived from second-order dynamics. Both use a polynomially decaying regularization parameter ε_k = c/(k+β)^δ. A unified Lyapunov-based framework yields explicit convergence rates under Hölderian growth or the Attouch–Czarnecki condition and establishes weak convergence to bilevel solutions in infinite-dimensional Hilbert spaces. The results extend prior work on accelerated schemes to more general geometric assumptions, allow flexible parameter schedules, and are supported by numerical experiments on linear inverse problems and logistic regression. The continuous-time analysis provides a principled basis for the discrete schemes and highlights the trade-offs between geometry, regularization decay, and acceleration.
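To make the diagonal idea concrete, here is a minimal Python sketch of a non-accelerated diagonal proximal-gradient iteration on a toy problem. It is an illustration under stated assumptions, not the scheme analyzed in the paper. The toy problem, the function name diagonal_prox_grad, the step size s, and the values c = 1, β = 1, δ = 0.5 are all hypothetical choices. The inner objective is g(x) = ½(x₁ + x₂ − 2)², whose minimizers form the line x₁ + x₂ = 2, and the outer objective is f(x) = ½‖x‖², so the bilevel solution is the minimum-norm point (1, 1). Each step takes a gradient step on g and then applies the proximal map of s·ε_k·f, which for this f is a simple rescaling.

```python
def diagonal_prox_grad(x0, steps=2000, s=0.4, c=1.0, beta=1.0, delta=0.5):
    """Toy diagonal proximal-gradient iteration (illustrative assumption).

    Inner objective: g(x) = 0.5 * (x1 + x2 - 2)**2  (gradient is 2-Lipschitz).
    Outer objective: f(x) = 0.5 * ||x||**2, handled through its proximal map.
    """
    x1, x2 = x0
    for k in range(steps):
        # Polynomially decaying regularization eps_k = c / (k + beta)**delta.
        eps = c / (k + beta) ** delta
        # Forward (gradient) step on the inner objective g.
        r = x1 + x2 - 2.0
        y1, y2 = x1 - s * r, x2 - s * r
        # Backward step: prox of t*f with t = s*eps is y / (1 + t) for this f.
        shrink = 1.0 + s * eps
        x1, x2 = y1 / shrink, y2 / shrink
    return x1, x2


# Start from an inner minimizer that is not the minimum-norm one.
x = diagonal_prox_grad((3.0, -1.0))
print(x)
```

In this sketch the iterates drift along the inner solution set toward (1, 1) while ε_k shrinks. The remaining gap to the inner solution set is of the order of ε_k, which illustrates the trade-off between regularization decay and accuracy mentioned above.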
Abstract
We analyze fast diagonal methods for simple bilevel programs. Guided by the analysis of the corresponding continuous-time dynamics, we provide a unified convergence analysis under general geometric conditions, including Hölderian growth and the Attouch-Czarnecki condition. Our results yield explicit convergence rates and guarantee weak convergence to a solution of the bilevel problem. In particular, we improve and extend recent results on accelerated schemes, offering novel insights into the trade-offs between geometry, regularization decay, and algorithmic design. Numerical experiments illustrate the advantages of more flexible methods and support our theoretical findings.
