Decentralized Implicit Differentiation

Lucas Fuentes Valenzuela, Robin Brown, Marco Pavone

TL;DR

This work discusses a decentralized framework for computing gradients of constraint-coupled optimization problems, shows that this framework yields significant computational gains, especially for large systems, and provides sufficient conditions for its validity.

Abstract

The ability to differentiate through optimization problems has unlocked numerous applications, from optimization-based layers in machine learning models to complex design problems formulated as bilevel programs. It has been shown that exploiting problem structure can yield significant computation gains for optimization and, in some cases, enable distributed computation. One should expect that this structure can be similarly exploited for gradient computation. In this work, we discuss a decentralized framework for computing gradients of constraint-coupled optimization problems. First, we show that this framework results in significant computational gains, especially for large systems, and provide sufficient conditions for its validity. Second, we leverage exponential decay of sensitivities in graph-structured problems towards building a fully distributed algorithm with convergence guarantees. Finally, we use the methodology to rigorously estimate marginal emissions rates in power systems models. Specifically, we demonstrate how the distributed scheme allows for accurate and efficient estimation of these important emissions metrics on large dynamic power system models.
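To make the idea of exploiting constraint-coupled structure concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a toy problem that does not appear in the source: $N$ strongly convex quadratic subproblems $\min \sum_i \tfrac{1}{2} x_i^\top Q_i x_i - c_i^\top x_i$ coupled only through equality constraints $\sum_i A_i x_i = b$, differentiated with respect to the linear cost terms $c_i$. All names (`Q`, `A`, `S`, `random_spd`) are hypothetical. The centralized route solves the full KKT system at once, while the block route needs only local solves with each $Q_i$ plus one small solve on the coupling multipliers.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, m = 4, 3, 2  # subproblems, local dimension, coupling constraints (illustrative sizes)

def random_spd(k):
    """Random symmetric positive definite matrix (hypothetical helper)."""
    M = rng.standard_normal((k, k))
    return M @ M.T + k * np.eye(k)

Q = [random_spd(n) for _ in range(N)]             # local Hessians
A = [rng.standard_normal((m, n)) for _ in range(N)]  # coupling blocks

# Centralized implicit differentiation: differentiate the KKT conditions
#   Q_i x_i - c_i + A_i^T lam = 0,   sum_i A_i x_i = b
# with respect to c, giving K [dx; dlam] = [I; 0].
K = np.zeros((N * n + m, N * n + m))
for i in range(N):
    s = slice(i * n, (i + 1) * n)
    K[s, s] = Q[i]
    K[s, N * n:] = A[i].T
    K[N * n:, s] = A[i]
rhs = np.vstack([np.eye(N * n), np.zeros((m, N * n))])
sol = np.linalg.solve(K, rhs)
dx_central, dlam_central = sol[:N * n], sol[N * n:]

# Block-structured route (Schur complement on the coupling multipliers).
Qinv_At = [np.linalg.solve(Q[i], A[i].T) for i in range(N)]  # local solves, each n x m
S = sum(A[i] @ Qinv_At[i] for i in range(N))                 # small m x m coupling system
# dlam/dc_j = S^{-1} A_j Q_j^{-1}; Q_j is symmetric, so A_j Q_j^{-1} = (Q_j^{-1} A_j^T)^T.
dlam = np.hstack([np.linalg.solve(S, Qinv_At[j].T) for j in range(N)])
dx = np.zeros((N * n, N * n))
for i in range(N):
    s = slice(i * n, (i + 1) * n)
    E = np.zeros((n, N * n))
    E[:, s] = np.eye(n)  # selects the columns belonging to c_i
    dx[s] = np.linalg.solve(Q[i], E - A[i].T @ dlam)  # local Jacobian block

max_err = max(np.abs(dx - dx_central).max(), np.abs(dlam - dlam_central).max())
print(f"max abs difference between the two routes: {max_err:.2e}")
```

In this sketch the only global object is the $m \times m$ matrix `S`, so the cost of the block route is dominated by $N$ independent local solves, which is the kind of structure the abstract refers to for weakly coupled problems. Inequality constraints and the fully distributed scheme discussed in the paper are deliberately left out.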

Paper Structure

This paper contains 26 sections, 5 theorems, 29 equations, 4 figures.

Key Result

Theorem 1

Consider a constraint-coupled optimization problem in the form of Problem (CCOP). If Assumptions A1, A2, A4, A5 hold for the global problem and if Assumption A3 holds for each subproblem individually, then all $N$ local Jacobians and the coupling Jacobian in decentralized implicit differentiation

Figures (4)

  • Figure 1: Illustrative bipartite graph linking subproblems (orange) and coupling constraints (blue).
  • Figure 2: Numerical experiments testing the complexity model of \ref{eq:complexity}. The decentralized scheme delivers significant computational gains in comparison to centralized differentiation, especially for weakly coupled problems (i.e., small $\rho$).
  • Figure 3: Relative error between the coupling Jacobian estimate and its true value for a randomly generated problem with the following parameters: each of the $N=50$ subproblems is constrained by $l=2$ inequality and $k=2$ equality constraints, and is involved in 2 equality coupling constraints. The distributed scheme converges linearly, at a rate that increases with $\omega$.
  • Figure 4: Approximation error of the marginal emissions rates as a function of the horizon parameter $\omega$, under different storage penetration scenarios for a 50-node network. Full line: median over all nodes in the network. Ribbons indicate the 10th and 90th percentiles. The approximation error decreases exponentially with $\omega$, at a rate that depends on storage penetration.

Theorems & Definitions (10)

  • Theorem 1: Decentralized implicit differentiation
  • Lemma 1
  • Theorem 2: Adapted from Shin2020-jc
  • proof
  • Lemma 2
  • proof
  • proof
  • Lemma 3
  • proof
  • proof