Optimistic Gradient Learning with Hessian Corrections for High-Dimensional Black-Box Optimization

Yedidya Kfir; Elad Sarafian; Sarit Kraus; Yoram Louzoun

TL;DR

This work tackles high-dimensional black-box optimization by introducing Optimistic Hessian Gradient Learning (OHGL), which unifies Evolutionary Gradient Learning (EvoGrad) and Higher-Order Gradient Learning (HGrad). EvoGrad biases gradient estimates toward globally favorable regions using CMA-ES-derived weights, while HGrad injects Hessian information to improve gradient accuracy, yielding faster convergence with provably controllable accuracy. The EvoGrad2 variant combines both ideas, using Monte Carlo pairwise sampling and detached Jacobians to scale learning to large problems; it achieves state-of-the-art results on the COCO suite and is also applied to adversarial training and code generation. The results show robust performance across dimensions, budgets, and accuracy requirements, highlighting EvoGrad2 as a practical tool for high-dimensional, non-linear black-box optimization in ML research and real-world tasks.

Abstract

Black-box algorithms are designed to optimize functions without relying on their underlying analytical structure or gradient information, making them essential when gradients are inaccessible or difficult to compute. Traditional methods for solving black-box optimization (BBO) problems predominantly rely on non-parametric models and struggle to scale to large input spaces. Conversely, parametric methods that model the function with neural estimators and obtain gradient signals via backpropagation may suffer from significant gradient errors. A recent alternative, Explicit Gradient Learning (EGL), which directly learns the gradient using a first-order Taylor approximation, has demonstrated superior performance over both parametric and non-parametric methods. In this work, we propose two novel gradient learning variants to address the robustness challenges posed by high-dimensional, complex, and highly non-linear problems. Optimistic Gradient Learning (OGL) introduces a bias toward lower regions in the function landscape, while Higher-order Gradient Learning (HGL) incorporates second-order Taylor corrections to improve gradient accuracy. We combine these approaches into the unified OHGL algorithm, achieving state-of-the-art (SOTA) performance on the synthetic COCO suite. Additionally, we demonstrate OHGL's applicability to high-dimensional real-world machine learning (ML) tasks such as adversarial training and code generation. Our results highlight OHGL's ability to generate stronger candidates, offering a valuable tool for ML researchers and practitioners tackling high-dimensional, non-linear optimization challenges.
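
To make the first-order Taylor idea concrete, the sketch below estimates a gradient at a single point by fitting f(x + d) - f(x) ≈ g · d over random perturbations d of norm eps. This is only an illustration under simplifying assumptions: it uses a local least-squares fit rather than a learned neural gradient model, and the function name `taylor_gradient_estimate` and its parameters (`eps`, `n_samples`, `seed`) are hypothetical, not taken from the paper.

```python
import numpy as np


def taylor_gradient_estimate(f, x, eps=1e-2, n_samples=64, seed=0):
    """Estimate the gradient of a black-box f at x (hypothetical helper).

    Fits g so that f(x + d) - f(x) ~= g . d for perturbations d with norm eps,
    i.e. a least-squares fit of the first-order Taylor expansion around x.
    """
    rng = np.random.default_rng(seed)
    dim = x.shape[0]

    # Random directions, rescaled so every perturbation has norm eps.
    deltas = rng.normal(size=(n_samples, dim))
    deltas *= eps / np.linalg.norm(deltas, axis=1, keepdims=True)

    # Function differences observed through black-box queries only.
    fx = f(x)
    diffs = np.array([f(x + d) - fx for d in deltas])

    # Least-squares solution of deltas @ g ~= diffs.
    g, *_ = np.linalg.lstsq(deltas, diffs, rcond=None)
    return g


if __name__ == "__main__":
    x0 = np.array([1.0, -2.0, 0.5])
    g_hat = taylor_gradient_estimate(lambda z: float(np.sum(z ** 2)), x0)
    print(g_hat)  # should be close to the true gradient 2 * x0
```

The residual of this fit is dominated by the neglected second-order term of the Taylor expansion, which is the kind of error the abstract says HGL's second-order corrections aim to reduce.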
