Differentiation of inertial methods for optimizing smooth parametric function

Jean-Jacques Godeme

TL;DR

This work analyzes how inertial optimization methods for smooth, strongly convex parametric problems can be differentiated with respect to a parameter $\theta$ using automatic differentiation. It establishes existence and uniqueness of the minimizer $x^*(\theta)$, proves global convergence and local linear rates for a broad class of inertial schemes, and derives explicit formulas for the derivative $\partial_{\theta}X^*(\theta)$, together with convergence of the derivatives of the iterates to this limit. A key contribution is the derivative-stability result, showing that $\partial_{\theta}X_k(\theta)$ converges to $\partial_{\theta}X^*(\theta)$ without requiring global Lipschitz bounds on second-order derivatives, and with a local linear rate for the derivative that includes a vanishing error term. The paper also provides numerical experiments on least-squares problems and a log-exponential model to illustrate state and derivative convergence and to validate the theoretical results. Overall, it offers a rigorous framework for differentiating inertial methods in parametric optimization, with applications to hyperparameter tuning and bilevel optimization.

Abstract

In this paper, we consider the minimization of a $C^2$-smooth and strongly convex objective depending on a given parameter, a setting that arises in many practical applications. We solve the problem with a family of inertial methods that covers a broad class of well-known inertial schemes. Our main goal is to analyze the derivative of this algorithm as an infinite iterative process in the sense of ``automatic'' differentiation. This procedure is very common and has gained more attention recently. From a pure optimization perspective and under some mild assumptions, we show that any sequence generated by these inertial methods converges to the unique minimizer of the problem, which depends on the parameter. Moreover, we show a local linear convergence rate of the generated sequence. Concerning the differentiation of the scheme, we prove that the derivative of the sequence with respect to the parameter converges to the derivative of the limit of the sequence, showing that any such sequence is ``derivative stable''. Finally, we investigate the rate at which this convergence occurs and show that it is locally linear with an error term tending to zero.
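To make the notion of derivative stability concrete, the following is a minimal sketch and not the scheme analyzed in the paper. It assumes a scalar heavy-ball iteration (one well-known inertial method) applied to a hypothetical objective $f(x,\theta) = \log(1 + e^{x-\theta}) + \tfrac{1}{2}x^2$, which is $C^2$-smooth and strongly convex in $x$. The step size `alpha`, the momentum `beta`, and all function names are illustrative choices. The tangent of each iterate with respect to $\theta$ is propagated alongside the iterate itself, which is what forward-mode automatic differentiation of the unrolled loop would compute, and the result is compared with the derivative of the minimizer given by the implicit function theorem, $\partial_\theta x^*(\theta) = -\partial^2_{x\theta} f / \partial^2_{xx} f$ evaluated at $x^*(\theta)$.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def grad_x(x, theta):
    # d/dx of the assumed objective f(x, theta) = log(1 + exp(x - theta)) + 0.5 * x**2
    return sigmoid(x - theta) + x


def hess_xx(x, theta):
    # Second derivative in x; bounded below by 1, so f is strongly convex in x.
    s = sigmoid(x - theta)
    return s * (1.0 - s) + 1.0


def hess_xtheta(x, theta):
    # Mixed second derivative d^2 f / (dx dtheta).
    s = sigmoid(x - theta)
    return -s * (1.0 - s)


def heavy_ball_with_tangent(theta, alpha=0.5, beta=0.3, iters=200):
    """Run heavy-ball and propagate d x_k / d theta by the chain rule.

    alpha and beta are illustrative values, not taken from the paper.
    The initial point does not depend on theta, so its tangent is zero.
    """
    x_prev = x = 0.0
    dx_prev = dx = 0.0
    for _ in range(iters):
        # State update: gradient step plus inertial (momentum) term.
        x_next = x - alpha * grad_x(x, theta) + beta * (x - x_prev)
        # Tangent update: differentiate the line above with respect to theta.
        dx_next = (
            dx
            - alpha * (hess_xx(x, theta) * dx + hess_xtheta(x, theta))
            + beta * (dx - dx_prev)
        )
        x_prev, x = x, x_next
        dx_prev, dx = dx, dx_next
    return x, dx


def implicit_derivative(x_star, theta):
    # Derivative of the minimizer from the implicit function theorem.
    return -hess_xtheta(x_star, theta) / hess_xx(x_star, theta)


theta = 0.7
x_k, dx_k = heavy_ball_with_tangent(theta)
dx_star = implicit_derivative(x_k, theta)
print("x_k =", x_k)
print("derivative of iterate  =", dx_k)
print("derivative of the limit =", dx_star)
```

In this toy setting the two printed derivatives should agree closely after enough iterations, which is the behavior the abstract calls derivative stability. A library such as JAX could replace the hand-written tangent recurrence; it is written out here only to keep the sketch dependency-free.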
