On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization
Motahareh Sohrabi, Juan Ramirez, Tianyue H. Zhang, Simon Lacoste-Julien, Jose Gallego-Posada
TL;DR
Constrained optimization in neural networks often suffers from unstable gradient descent-ascent dynamics. This work introduces νPI, a PI-like multiplier update augmented with an exponential moving average, to stabilize the dynamics of the Lagrange multipliers. νPI generalizes momentum methods (Polyak, Nesterov) and the optimistic gradient method (OG) via a unifying mapping, and the analysis offers qualitative and quantitative insights into its damping behavior. A continuous-time analysis casts the multiplier dynamics as an oscillator and gives conditions under which νPI achieves critical damping, improving on gradient ascent (GA). Experiments on SVMs, fairness, and sparsity tasks show improved stability and robust convergence, and the paper provides practical guidance for tuning. Overall, νPI provides a reliable mechanism with robust, predictable hyperparameters for enforcing constraints in large-scale, nonconvex learning problems, with implications for safety, fairness, and model compression.
Abstract
Constrained optimization offers a powerful framework to prescribe desired behaviors in neural network models. Typically, constrained problems are solved via their min-max Lagrangian formulations, which exhibit unstable oscillatory dynamics when optimized using gradient descent-ascent. The adoption of constrained optimization techniques in the machine learning community is currently limited by the lack of reliable, general-purpose update schemes for the Lagrange multipliers. This paper proposes the $ν$PI algorithm and contributes an optimization perspective on Lagrange multiplier updates based on PI controllers, extending the work of Stooke, Achiam and Abbeel (2020). We provide theoretical and empirical insights explaining the inability of momentum methods to address the shortcomings of gradient descent-ascent, and contrast this with the empirical success of our proposed $ν$PI controller. Moreover, we prove that $ν$PI generalizes popular momentum methods for single-objective minimization. Our experiments demonstrate that $ν$PI reliably stabilizes the multiplier dynamics and its hyperparameters enjoy robust and predictable behavior.
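To make the idea of a PI-style multiplier update concrete, here is a minimal sketch. It assumes a textbook PI form in which the proportional term acts on an exponential moving average of the constraint violation and the integral term acts on the running sum of violations. The class name `PIMultiplier` and the parameters `kp`, `ki`, `nu`, and `init` are hypothetical names chosen for illustration. The sketch is not necessarily the exact update rule of the paper.

```python
class PIMultiplier:
    """Illustrative PI-style updater for a single Lagrange multiplier.

    Assumption: this is a generic PI form with an EMA-smoothed proportional
    term, written for an equality constraint. For an inequality constraint
    the multiplier would also need to be kept non-negative.
    """

    def __init__(self, kp: float, ki: float, nu: float = 0.0, init: float = 0.0):
        self.kp = kp          # proportional gain (hypothetical name)
        self.ki = ki          # integral gain; plays the role of a step size
        self.nu = nu          # EMA coefficient, expected in [0, 1)
        self.init = init      # initial multiplier value
        self.ema = 0.0        # EMA of the constraint violation
        self.integral = 0.0   # running sum of constraint violations
        self.value = init     # current multiplier value

    def step(self, violation: float) -> float:
        """Consume the current constraint violation and return the new multiplier."""
        self.ema = self.nu * self.ema + (1.0 - self.nu) * violation
        self.integral += violation
        self.value = self.init + self.kp * self.ema + self.ki * self.integral
        return self.value


# With kp = 0 the update reduces to plain gradient ascent with step size ki.
ga_like = PIMultiplier(kp=0.0, ki=0.1)
multipliers = [ga_like.step(v) for v in [1.0, 2.0, -0.5]]
```

In this sketch, setting `kp = 0` recovers gradient ascent on the multiplier with step size `ki`. A positive `kp` adds a term that reacts to the current (smoothed) violation rather than only to its accumulated sum, which is the kind of damping mechanism the abstract refers to.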
