Implicit Differentiation for Hyperparameter Tuning the Weighted Graphical Lasso

Can Pouliquen, Paulo Gonçalves, Mathurin Massias, Titouan Vayer

TL;DR

This work tunes the hyperparameters of the Graphical Lasso by casting the tuning task as a bilevel optimization problem and deriving the hypergradient via implicit differentiation. The core contribution is a closed-form Jacobian of the Graphical Lasso solution with respect to scalar and matrix regularization parameters, obtained by differentiating the fixed-point equation of the proximal update while handling its non-smoothness. The authors extend the scalar case to a matrix of hyperparameters, which yields a fourth-order Jacobian tensor, and show how to reuse a Kronecker-product inverse to reduce computation. Empirical results on synthetic data show that the proposed first-order approach can match grid search in the scalar case and that matrix regularization improves performance further, although the resulting non-convex outer problem motivates further optimization refinements.
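
As a sketch of the formulation summarized above, the scalar bilevel problem and its hypergradient can be written as follows. The notation is an assumption made here rather than a quotation from the paper: $\mathbf{S}$ denotes the empirical covariance, $\mathcal{C}$ the outer criterion, and the penalty is taken in its standard $\ell_1$ form.

$$
\lambda^{\mathrm{opt}} \in \operatorname*{arg\,min}_{\lambda > 0} \; \mathcal{C}\big(\hat{\boldsymbol{\Theta}}(\lambda)\big)
\quad \text{s.t.} \quad
\hat{\boldsymbol{\Theta}}(\lambda) \in \operatorname*{arg\,min}_{\boldsymbol{\Theta} \succ 0} \; -\log\det \boldsymbol{\Theta} + \langle \mathbf{S}, \boldsymbol{\Theta} \rangle + \lambda \|\boldsymbol{\Theta}\|_1
$$

$$
\frac{\mathrm{d}\,\mathcal{C}}{\mathrm{d}\lambda}(\lambda) = \Big\langle \nabla \mathcal{C}\big(\hat{\boldsymbol{\Theta}}(\lambda)\big), \; \frac{\partial \hat{\boldsymbol{\Theta}}}{\partial \lambda}(\lambda) \Big\rangle
$$

In the matrix case the penalty would presumably read $\|\mathbf{\Lambda} \odot \boldsymbol{\Theta}\|_1$, so the Jacobian collects the entries $\partial \hat{\boldsymbol{\Theta}}_{ij} / \partial \mathbf{\Lambda}_{kl}$, which is where the fourth-order tensor comes from.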

Abstract

We provide a framework and algorithm for tuning the hyperparameters of the Graphical Lasso via a bilevel optimization problem solved with a first-order method. In particular, we derive the Jacobian of the Graphical Lasso solution with respect to its regularization hyperparameters.
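
To make the idea concrete, here is a minimal NumPy sketch of a hypergradient for the scalar case. It is not the authors' implementation, and several choices are assumptions: the inner problem is solved with a textbook ADMM loop (swapped in for whatever solver the paper uses), the $\ell_1$ penalty is applied to all entries including the diagonal, the outer criterion is a held-out negative log-likelihood, and every function name is hypothetical. The Jacobian comes from differentiating the optimality condition on the support of the solution, which gives a linear system in a sub-block of a Kronecker product rather than the paper's closed form.

```python
import numpy as np

def soft_threshold(A, t):
    """Entrywise proximal operator of t * ||.||_1."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def glasso_admm(S, lam, rho=1.0, max_iter=5000, tol=1e-12):
    """Solve min_{Theta > 0} -logdet(Theta) + <S, Theta> + lam * ||Theta||_1 (ADMM)."""
    p = S.shape[0]
    Z, U = np.eye(p), np.zeros((p, p))
    for _ in range(max_iter):
        # Theta-update: closed form through an eigendecomposition.
        evals, Q = np.linalg.eigh(rho * (Z - U) - S)
        theta_evals = (evals + np.sqrt(evals**2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (Q * theta_evals) @ Q.T
        Theta = 0.5 * (Theta + Theta.T)           # enforce exact symmetry
        Z_old = Z
        Z = soft_threshold(Theta + U, lam / rho)  # sparse copy of Theta
        U = U + Theta - Z                         # scaled dual update
        if max(np.abs(Theta - Z).max(), np.abs(Z - Z_old).max()) < tol:
            break
    return Z

def glasso_jacobian(Theta_hat):
    """d Theta_hat / d lam, assuming the support is locally constant.

    On the support, optimality reads -inv(Theta) + S + lam * sign(Theta) = 0.
    Differentiating in lam gives (W kron W)[supp, supp] @ dtheta = -sign(Theta)[supp]
    with W = inv(Theta); entries outside the support have zero derivative.
    """
    p = Theta_hat.shape[0]
    W = np.linalg.inv(Theta_hat)
    support = np.flatnonzero(Theta_hat.ravel())
    K = np.kron(W, W)[np.ix_(support, support)]
    rhs = -np.sign(Theta_hat).ravel()[support]
    J = np.zeros(p * p)
    J[support] = np.linalg.solve(K, rhs)
    return J.reshape(p, p)

def criterion(Theta, S_val):
    """Assumed outer criterion: held-out Gaussian negative log-likelihood."""
    _, logdet = np.linalg.slogdet(Theta)
    return -logdet + np.sum(S_val * Theta)

def hypergradient(Theta_hat, S_val):
    """Chain rule: <grad C(Theta_hat), d Theta_hat / d lam>."""
    grad_C = S_val - np.linalg.inv(Theta_hat)
    return np.sum(grad_C * glasso_jacobian(Theta_hat))

# Synthetic data drawn from a tridiagonal precision matrix (illustrative only).
rng = np.random.default_rng(0)
p, n = 5, 500
Theta_true = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Theta_true), size=2 * n)
S_train = X[:n].T @ X[:n] / n
S_val = X[n:].T @ X[n:] / n

lam = 0.1
Theta_hat = glasso_admm(S_train, lam)
hypergrad = hypergradient(Theta_hat, S_val)
print("hypergradient dC/dlam:", hypergrad)
```

A gradient step on `lam` using `hypergrad` (for example on a log-parametrization to keep it positive) would give one iteration of hypergradient descent. The explicit `np.kron` is only practical for small dimensions, since it materializes a `p**2` by `p**2` matrix.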

Paper Structure (12 sections, 3 theorems, 21 equations, 3 figures)

Key Result

Proposition 1

Let $\hat{\boldsymbol{\Theta}}(\lambda)$ be a solution of the Graphical Lasso problem (eq:graphical_lasso). Then, using Fermat's rule and the expression of the subdifferential of the $\ell_1$-norm [beck2017first],

Figures (3)

  • Figure 1: Value of the criterion $\mathcal{C}$ w.r.t. $\lambda$ for grid-search and our method, along with the oracle RE.
  • Figure 2: Outer objective value for the bilevel problem along iterations of hypergradient descent.
  • Figure 3: Visualization of the matrices $\mathbf{\Lambda}^\mathrm{opt}$, $\boldsymbol{\Theta}_\mathrm{true}$ and $\widehat{\boldsymbol{\Theta}}(\mathbf{\Lambda}^\mathrm{opt})$.

Theorems & Definitions (4)

  • Proposition 1
  • Proposition 3
  • proof
  • Proposition 4