A Structured Tour of Optimization with Finite Differences

Marco Rando; Cesare Molinari; Lorenzo Rosasco; Silvia Villa

TL;DR

This paper analyzes structured versus unstructured direction generation for finite-difference zeroth-order optimization under fixed evaluation budgets. It reviews and extends several structured constructions (e.g., QR-based orthogonalization, random Householder, Butterfly, and permuted variants) and benchmarks them against unstructured methods on synthetic, CUTEst, and high-dimensional adversarial MNIST tasks. The findings show that structured directions can achieve gradient-approximation quality and convergence comparable to or better than unstructured approaches at similar cost, particularly when the number of directions ℓ is a substantial fraction of the dimension d (e.g., ℓ ≥ d/3 or ℓ ≥ d/2). The results advocate for incorporating structure in direction design for high-dimensional zeroth-order problems and motivate further theory and scalable implementations for large-scale applications such as large language model fine-tuning.

Abstract

Finite-difference methods are widely used for zeroth-order optimization in settings where gradient information is unavailable or expensive to compute. These procedures mimic first-order strategies by approximating gradients through function evaluations along a set of random directions. From a theoretical perspective, recent studies indicate that imposing structure (such as orthogonality) on the chosen directions allows for the derivation of convergence rates comparable to those achieved with unstructured random directions (i.e., directions sampled independently from a distribution). Empirically, although structured directions are expected to enhance performance, they often introduce additional computational costs, which can limit their applicability in high-dimensional settings. In this work, we examine the impact of structured direction selection in finite-difference methods. We review and extend several strategies for constructing structured direction matrices and compare them with unstructured approaches in terms of computational cost, gradient approximation quality, and convergence behavior. Our evaluation spans both synthetic tasks and real-world applications such as adversarial perturbation. The results demonstrate that structured directions can be generated with computational costs comparable to unstructured ones while significantly improving gradient estimation accuracy and optimization performance.
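
To make the comparison in the abstract concrete, the following is a minimal NumPy sketch of a forward finite-difference gradient surrogate evaluated along either unstructured directions (independent samples, here normalized Gaussians) or structured directions (orthonormal columns from a QR factorization). It is an illustration under stated assumptions, not the authors' implementation: the function names, the step size `h`, the `d / ell` scaling, and the quadratic test function are all choices made for this example.

```python
import numpy as np


def unstructured_directions(d, ell, rng):
    """Return a d x ell matrix whose columns are independent unit vectors.

    Assumption: directions are Gaussian samples normalized to the unit sphere.
    """
    P = rng.standard_normal((d, ell))
    return P / np.linalg.norm(P, axis=0, keepdims=True)


def orthogonal_directions(d, ell, rng):
    """Return a d x ell matrix with orthonormal columns (requires ell <= d).

    Assumption: structure is imposed by a reduced QR factorization of a
    Gaussian matrix; other constructions are possible.
    """
    Q, _ = np.linalg.qr(rng.standard_normal((d, ell)))
    return Q


def fd_gradient(f, x, P, h=1e-6):
    """Forward finite-difference surrogate of the gradient of f at x.

    Computes (d / ell) * sum_i [(f(x + h p_i) - f(x)) / h] * p_i,
    where p_i are the columns of P. The d / ell scaling is an assumption
    chosen so that the surrogate matches the gradient on average.
    """
    d, ell = P.shape
    fx = f(x)
    coeffs = np.array([(f(x + h * P[:, i]) - fx) / h for i in range(ell)])
    return (d / ell) * (P @ coeffs)


# Hypothetical test problem: a convex quadratic with a known gradient.
d, ell = 20, 10
rng = np.random.default_rng(0)
a = np.linspace(1.0, 5.0, d)


def f(x):
    return 0.5 * np.sum(a * x**2)


x0 = rng.standard_normal(d)
true_grad = a * x0

g_unstructured = fd_gradient(f, x0, unstructured_directions(d, ell, rng))
g_structured = fd_gradient(f, x0, orthogonal_directions(d, ell, rng))

print("unstructured error:", np.linalg.norm(g_unstructured - true_grad))
print("structured error:  ", np.linalg.norm(g_structured - true_grad))
```

Both variants use ell + 1 function evaluations per surrogate, so they are compared at the same evaluation budget. With ell = d, the orthonormal construction spans the whole space and the surrogate recovers the gradient up to the finite-difference error, whereas independent directions generally do not.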
