Analyzing and Enhancing the Backward-Pass Convergence of Unrolled Optimization
James Kotary, Jacob Christopher, My H Dinh, Ferdinando Fioretto
TL;DR
This work addresses differentiation through optimization layers in neural networks. It analyzes the backward pass of unrolled optimization and shows that it is asymptotically equivalent to solving a linear system by a fixed-point iteration. It introduces unfolded optimization, which defers inner differentiations to Jacobian-gradient products, and develops Folded Optimization, which separates the forward and backward passes and solves the backward problem with efficient linear-algebra methods (e.g., linear fixed-point iteration (LFPI) and Krylov methods) using only Jacobian-vector products. The authors provide theoretical results on backward-pass convergence, empirical demonstrations of the pitfalls of naive unrolling, and an open-source library, fold-opt, that supports differentiable optimization layers for convex and nonconvex problems and for black-box solvers. The framework delivers computational gains and modeling flexibility across decision-focused learning tasks, AC-OPF, portfolio optimization, denoising, and multilabel classification, connecting differentiable optimization with task-specific solvers in end-to-end learning pipelines.
Abstract
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks. A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form. One typical strategy is algorithm unrolling, which relies on automatic differentiation through the entire chain of operations executed by an iterative optimization solver. This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is asymptotically equivalent to the solution of a linear system by a particular iterative method. Several practical pitfalls of unrolling are demonstrated in light of these insights, and a system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations. Experiments over various end-to-end optimization and learning tasks demonstrate the advantages of this system both in computational cost and in flexibility across optimization problem forms.
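The equivalence between the unrolled backward pass and an iteratively solved linear system can be illustrated on a toy problem. The sketch below is not the paper's implementation and does not use the fold-opt API. It assumes a hypothetical solver step U(x, c) = x - alpha * (A x - c), i.e. gradient descent on a quadratic with fixed point x* = A^{-1} c, and the names `Phi`, `Psi`, and `lfpi_jacobian` are illustrative choices. With Phi = dU/dx and Psi = dU/dc evaluated at the fixed point, the Jacobian dx*/dc satisfies (I - Phi) J = Psi, and iterating J <- Phi J + Psi converges to the same solution whenever the spectral radius of Phi is below 1.

```python
import numpy as np

def lfpi_jacobian(Phi, Psi, iters=500):
    """Iterate J <- Phi @ J + Psi, a linear fixed-point iteration.

    Converges to the solution of (I - Phi) J = Psi when the spectral
    radius of Phi is below 1.
    """
    J = np.zeros_like(Psi)
    for _ in range(iters):
        J = Phi @ J + Psi
    return J

# Toy problem (assumed for illustration): minimize 0.5 x^T A x - c^T x,
# solved by gradient descent with step size alpha.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
alpha = 0.1
I = np.eye(2)

# Partial Jacobians of the update U(x, c) = x - alpha * (A x - c).
Phi = I - alpha * A   # dU/dx
Psi = alpha * I       # dU/dc

J_iter = lfpi_jacobian(Phi, Psi)            # iterative "backward pass"
J_direct = np.linalg.solve(I - Phi, Psi)    # direct linear solve

print(np.allclose(J_iter, J_direct))
```

In this toy case both results equal A^{-1}, the exact Jacobian of x* = A^{-1} c with respect to c. The direct solve stands in for the more efficient linear-algebra methods that a separated backward pass could use in place of the iteration.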
