On Differentiating Parameterized Argmin and Argmax Problems with Application to Bi-level Optimization
Stephen Gould, Basura Fernando, Anoop Cherian, Peter Anderson, Rodrigo Santa Cruz, Edison Guo
TL;DR
The paper derives exact gradient expressions for differentiating parameterized argmin and argmax problems that arise as lower-level problems in bi-level optimization, covering both unconstrained and constrained cases. It presents general implicit-differentiation formulas, extends them to equality and inequality constraints via null-space and barrier methods, and illustrates them with scalar and softmax examples. A bi-level learning example shows how to adjust model parameters to steer the location of maximum-likelihood features, illustrating how these gradients support end-to-end learning. The discussion addresses computational considerations and suggests directions for scalable and non-smooth settings.
Abstract
Some recent works in machine learning and computer vision involve the solution of a bi-level optimization problem. Here the solution of a parameterized lower-level problem binds variables that appear in the objective of an upper-level problem. The lower-level problem typically appears as an argmin or argmax optimization problem. Many techniques have been proposed to solve bi-level optimization problems, including gradient descent, which is popular with current end-to-end learning approaches. In this technical report we collect some results on differentiating argmin and argmax optimization problems with and without constraints and provide some insightful motivating examples.
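To make the idea of differentiating an argmin concrete, the sketch below works through the scalar unconstrained case. For g(x) = argmin_y f(x, y) with f twice differentiable and f_YY > 0 at the minimizer, implicit differentiation of the stationarity condition f_Y(x, g(x)) = 0 gives g'(x) = -f_XY(x, g(x)) / f_YY(x, g(x)). The objective f(x, y) = exp(y) - x*y and all function names here are illustrative assumptions chosen for this sketch, not taken from the report; this objective has the closed-form minimizer g(x) = log(x) for x > 0, so the result can be checked against g'(x) = 1/x.

```python
import math

# Hypothetical lower-level objective (an assumption for illustration):
#   f(x, y) = exp(y) - x * y,  defined for x > 0.
# Its partial derivatives, written out by hand:
def f_y(x, y):
    return math.exp(y) - x      # df/dy

def f_yy(x, y):
    return math.exp(y)          # d^2 f / dy^2 (positive, so f is convex in y)

def f_xy(x, y):
    return -1.0                 # d^2 f / dx dy

def solve_lower(x, y0=0.0, tol=1e-12, max_iter=100):
    """Solve the lower-level problem argmin_y f(x, y) with Newton's method."""
    y = y0
    for _ in range(max_iter):
        step = f_y(x, y) / f_yy(x, y)
        y -= step
        if abs(step) < tol:
            break
    return y

def argmin_gradient(x):
    """Derivative of g(x) = argmin_y f(x, y) via implicit differentiation."""
    y_star = solve_lower(x)
    return -f_xy(x, y_star) / f_yy(x, y_star)

def finite_difference(x, h=1e-6):
    """Central-difference estimate of g'(x), used only as a sanity check."""
    return (solve_lower(x + h) - solve_lower(x - h)) / (2.0 * h)

if __name__ == "__main__":
    x = 2.0
    print("g(x)            =", solve_lower(x))
    print("implicit g'(x)  =", argmin_gradient(x))
    print("numerical g'(x) =", finite_difference(x))
```

The implicit formula needs only one solve of the lower-level problem plus two second derivatives evaluated at the solution, whereas the finite-difference check re-solves the problem twice. In a gradient-based bi-level method, g'(x) would then be combined with the upper-level objective's gradient through the chain rule.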
