Neural Global Optimization via Iterative Refinement from Noisy Samples

Qusay Muzaffar, David Levin, Michael Werman

Abstract

Global optimization of black-box functions from noisy samples is a fundamental challenge in machine learning and scientific computing. Traditional methods such as Bayesian Optimization often converge to local minima on multi-modal functions, while gradient-free methods require many function evaluations. We present a novel neural approach that learns to find global minima through iterative refinement. Our model takes noisy function samples and their fitted spline representation as input, then iteratively refines an initial guess toward the true global minimum. Trained on randomly generated functions with ground-truth global minima obtained via exhaustive search, our method achieves a mean error of 8.05 percent on challenging multi-modal test functions, compared to 36.24 percent for the spline initialization, an improvement of 28.18 percentage points. The model finds global minima with error below 10 percent in 72 percent of test cases, demonstrating learned optimization principles rather than mere curve fitting. Our architecture combines encodings of multiple modalities (function values, derivatives, and spline coefficients) with iterative position updates, enabling robust global optimization without requiring derivative information or multiple restarts.
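
The spline-initialization baseline that the abstract compares against can be reproduced in a few lines. The snippet below is a minimal sketch, not the paper's exact pipeline: the choice of scipy's CubicSpline, the 40-sample input, the test function, and the 1,000-point search grid are our own illustrative assumptions.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def spline_initial_guess(xs, ys, grid_size=1000):
    """Fit a cubic spline to noisy samples (xs, ys) and return the
    location of the spline's minimum on a dense grid -- the baseline
    initialization that the neural model then refines."""
    spline = CubicSpline(xs, ys)
    grid = np.linspace(xs.min(), xs.max(), grid_size)
    return grid[np.argmin(spline(grid))]

# Example: noisy samples of a hypothetical multi-modal function.
rng = np.random.default_rng(0)
xs = np.linspace(0.0, 1.0, 40)
ys = np.sin(12 * xs) + 0.5 * np.cos(23 * xs) + 0.3 * rng.standard_normal(xs.shape)
x0 = spline_initial_guess(xs, ys)  # starting point for iterative refinement
```

With heavy noise, the spline's argmin can land far from the true global minimum (the failure mode illustrated in Figure 4, left), which is exactly the gap the learned refinement is meant to close.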

Paper Structure

This paper contains 32 sections, 41 equations, 4 figures, 1 table, and 1 algorithm.

Figures (4)

  • Figure 1: Detailed architecture showing all components. MainEncoder (top): Four modalities encoded independently via linear+StableCubic, concatenated, fused through U-Net (4 encoder stages, bottleneck, 3 decoder stages with skip connections), then multi-scale pooled (global/focus/local) and projected to $\mathbf{e}_0 \in \mathbb{R}^{64}$. Iterator (middle): Takes encoding, position, and previous step; feeds through MLP(256); produces direction (tanh head) and step size (softplus head); updates position. Updater (right): Decompresses encoding back to per-sample features via expansion and inverse U-Net; applies four separate Modifiers conditioned on new position and step; re-encodes via U-Net and multi-scale pooling to produce $\mathbf{e}_{t+1}$. Loop continues until step size variance $< 10^{-5}$ or 40 iterations. An illustrative code sketch of the Iterator loop follows this list.
  • Figure 2: Training convergence over 300,000 epochs. Error stabilizes around 10% after epoch 100,000, indicating the model has learned effective optimization strategies. Remaining error reflects the inherent difficulty of global optimization on NIGHTMARE functions with 30% noise.
  • Figure 3: Error distribution on 50 test functions. The distribution is heavily skewed toward low error, with median (5.73%) significantly below mean (8.05%). Most cases achieve excellent performance, with occasional challenging outliers.
  • Figure 4: Example test cases showing model finding global minima. Left: Case with poor spline initialization (x=0.026, far from true x=0.368); model refines to x=0.371 (0.3% error). Right: Multi-modal function where model improves from 26.2% (spline) to 1.0% (model).
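
Figure 1's Iterator and stopping rule can be rendered as a short position-update loop. The sketch below is illustrative, not the authors' implementation: the tanh direction head, softplus step head, MLP width of 256, and the stop criterion (step-size variance $< 10^{-5}$ or 40 iterations) come from the caption, while the input layout, the five-step variance window, and the fixed encoding are simplifying assumptions (the full model re-encodes via the Updater at every step, which is omitted here).

```python
import torch
import torch.nn as nn

class Iterator(nn.Module):
    """One refinement step: (encoding, position, previous step) -> (direction, step size).
    Mirrors the Iterator block of Figure 1; hidden width follows the caption's MLP(256)."""
    def __init__(self, enc_dim=64, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(enc_dim + 2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.direction_head = nn.Linear(hidden, 1)  # tanh -> direction in [-1, 1]
        self.step_head = nn.Linear(hidden, 1)       # softplus -> positive step size

    def forward(self, encoding, position, prev_step):
        h = self.mlp(torch.cat([encoding, position, prev_step], dim=-1))
        direction = torch.tanh(self.direction_head(h))
        step = nn.functional.softplus(self.step_head(h))
        return direction, step

@torch.no_grad()
def refine(iterator, encoding, x0, max_iters=40, var_tol=1e-5, window=5):
    """Iteratively update the position until the variance of recent step
    sizes drops below var_tol or max_iters is reached (Figure 1's stop rule)."""
    position = torch.tensor([[x0]])
    prev_step = torch.zeros_like(position)
    recent_steps = []
    for _ in range(max_iters):
        direction, step = iterator(encoding, position, prev_step)
        position = position + direction * step
        prev_step = step
        recent_steps.append(step.item())
        if len(recent_steps) >= window and torch.tensor(recent_steps[-window:]).var() < var_tol:
            break
    return position

# Usage with a stand-in encoding (in the real model this is the
# MainEncoder output e_0, and x0 is the spline initialization).
iterator = Iterator()
encoding = torch.randn(1, 64)
x_hat = refine(iterator, encoding, x0=0.5)
```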