NeuroPareto: Calibrated Acquisition for Costly Many-Goal Search in Vast Parameter Spaces

Rong Fu, Wenxin Zhang, Chunlei Meng, Youjin Wang, Haoyu Zhao, Jiaxuan Lu, Kun Liu, JiaBao Dou, Simon James Fong

TL;DR

NeuroPareto tackles expensive, high-dimensional multi-objective optimization by jointly calibrating uncertainty, learning surrogate models with Deep Gaussian Processes, and training a history-conditioned acquisition network. The framework integrates a calibrated Bayesian rank classifier for cheap rank-guided screening, complexity-reduced Deep GP surrogates that decompose epistemic and aleatoric uncertainty, and a lightweight, online-trained acquisition network that estimates hypervolume gains and diversity from optimization traces. Ablations and benchmarks on DTLZ/ZDT and a geothermal reservoir case show consistent gains in HV and Pareto proximity under limited evaluations, due to staged screening, amortized updates, and the synergy among components. The work provides practical calibration guarantees, scalability to high-dimensional problems, and a clear workflow for deploying uncertainty-aware, surrogate-assisted MOBO in real-world expensive settings.

Abstract

The pursuit of optimal trade-offs in high-dimensional search spaces under stringent computational constraints poses a fundamental challenge for contemporary multi-objective optimization. We develop NeuroPareto, a cohesive architecture that integrates rank-centric filtering, uncertainty disentanglement, and history-conditioned acquisition strategies to navigate complex objective landscapes. A calibrated Bayesian classifier estimates epistemic uncertainty across non-domination tiers, enabling rapid generation of high-quality candidates with minimal evaluation cost. Deep Gaussian Process surrogates further separate predictive uncertainty into reducible and irreducible components, providing refined predictive means and risk-aware signals for downstream selection. A lightweight acquisition network, trained online from historical hypervolume improvements, guides expensive evaluations toward regions balancing convergence and diversity. With hierarchical screening and amortized surrogate updates, the method maintains accuracy while keeping computational overhead low. Experiments on DTLZ and ZDT suites and a subsurface energy extraction task show that NeuroPareto consistently outperforms classifier-enhanced and surrogate-assisted baselines in Pareto proximity and hypervolume.
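The abstract leans on two standard notions: non-domination tiers (the labels a rank classifier would predict) and hypervolume improvement (the signal the acquisition network is trained on). The sketch below illustrates both under stated assumptions: all objectives are minimized, the hypervolume routine handles only two objectives, and the function names `non_domination_ranks`, `hypervolume_2d`, and `hv_improvement` are hypothetical, not taken from the paper's code.

```python
import numpy as np

def non_domination_ranks(F):
    """Tier of each row of F under minimization (0 = current Pareto front)."""
    F = np.asarray(F, dtype=float)
    ranks = np.full(F.shape[0], -1)
    remaining = np.arange(F.shape[0])
    tier = 0
    while remaining.size:
        sub = F[remaining]
        # leq[j, i]: point j is no worse than point i in every objective
        leq = np.all(sub[:, None, :] <= sub[None, :, :], axis=2)
        # lt[j, i]: point j is strictly better than point i in some objective
        lt = np.any(sub[:, None, :] < sub[None, :, :], axis=2)
        dominated = np.any(leq & lt, axis=0)
        ranks[remaining[~dominated]] = tier
        remaining = remaining[dominated]
        tier += 1
    return ranks

def hypervolume_2d(F, ref):
    """Area dominated by F and bounded by the reference point (2 objectives)."""
    F = np.asarray(F, dtype=float).reshape(-1, 2)
    ref = np.asarray(ref, dtype=float)
    F = F[np.all(F < ref, axis=1)]      # keep points that dominate ref
    if F.shape[0] == 0:
        return 0.0
    F = F[np.argsort(F[:, 0])]          # sweep in increasing f1
    hv, best_f2 = 0.0, ref[1]
    for f1, f2 in F:
        if f2 < best_f2:                # point adds a new horizontal strip
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return float(hv)

def hv_improvement(front, candidate, ref):
    """Hypervolume gained by adding one candidate to an existing set."""
    merged = np.vstack([np.asarray(front, dtype=float), candidate])
    return hypervolume_2d(merged, ref) - hypervolume_2d(front, ref)

F = np.array([[1, 3], [2, 2], [3, 1], [2, 3], [3, 3]], dtype=float)
ranks = non_domination_ranks(F)
gain = hv_improvement(F[ranks == 0], [1.5, 2.5], ref=[4, 4])
```

With more than two objectives the sweep no longer applies and an exact or Monte Carlo hypervolume routine would be needed instead.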

Paper Structure

This paper contains 109 sections, 10 theorems, 85 equations, 19 figures, 15 tables, 1 algorithm.

Key Result

Proposition 15.1

Under sub-Gaussian tail assumptions on the logits and Lipschitz continuity of the softmax map, the expected calibration error satisfies a bound stated in terms of the following quantities: $S_0$ is the baseline MC sample count used to estimate whether additional passes are needed, $S_{\max}$ is the allowed maximum, $N_{\mathrm{calib}}$ is the calibration set size used to fit $T$, and $\epsilon_{\mathrm{model}}$ quantifies the mismatch between the ...
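To make the quantity in this proposition concrete, the following is a minimal sketch of how the expected calibration error of a temperature-scaled, MC-averaged softmax could be measured. It assumes equal-width confidence bins and a grid search over $T$ that minimizes negative log-likelihood on the calibration set. The helper names, the bin count, and the grid are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mc_mean_probs(logit_samples, T=1.0):
    """Average class probabilities over S stochastic passes: (S, N, K) -> (N, K)."""
    return softmax(logit_samples, T).mean(axis=0)

def fit_temperature(logits, labels, grid=np.linspace(0.25, 5.0, 96)):
    """Pick T on a grid by minimizing NLL on a calibration set (assumed procedure)."""
    labels = np.asarray(labels)
    def nll(T):
        p = softmax(logits, T)
        return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))
    return float(min(grid, key=nll))

def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted mean gap between accuracy and confidence over equal-width bins."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.digitize(conf, edges[1:-1], right=True)   # bin index in 0..n_bins-1
    ece = 0.0
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)

# Overconfident toy classifier: always predicts class 0 with large margin,
# but is right only 70% of the time.
logits = np.tile([4.0, 0.0], (10, 1))
labels = np.array([0] * 7 + [1] * 3)
T_hat = fit_temperature(logits, labels)
ece_before = expected_calibration_error(softmax(logits), labels)
ece_after = expected_calibration_error(softmax(logits, T_hat), labels)
```

In this toy case the fitted temperature is well above 1, which softens the predictions and shrinks the gap between confidence and accuracy.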

Figures (19)

  • Figure 1: Overview of the NeuroPareto framework for high-dimensional, budget-constrained multi-objective optimization. The pipeline consists of three synergistic modules. (1) The Bayesian Rank Classifier ($g_{\theta}$) screens a massive candidate pool $\mathcal{C}_{\mathrm{rank}}$ using temperature-calibrated softmax $\tilde{p}_k$ and adaptive MC dropout to quantify classifier epistemic uncertainty $u_{\mathrm{ep}}^{\mathrm{clf}}$. (2) The Complexity-Reduced Deep GP pipeline processes the filtered subset, utilizing sparse variational inference with inducing variables $\mathbf{u}$ and Randomized Fourier Features (RFF) to output predictive objective means $\hat{\mathbf{f}}(\mathbf{x})$ alongside decomposed epistemic $u_{\mathrm{ep}}^{\mathrm{gp}}$ and aleatoric $u_{\mathrm{al}}^{\mathrm{gp}}$ variances. (3) The History-Aware Acquisition Network ($a_{\psi}$) aggregates these signals with sliding window statistics ($\mu_{\Delta\mathrm{HV}}, \sigma_{\Delta\mathrm{HV}}$) to predict hypervolume utility $\hat{s}_{\mathrm{HV}}$ and diversity $\hat{s}_{\mathrm{div}}$. The framework is updated online via a bounded history buffer $B$, ensuring a calibrated exploration-exploitation trade-off while minimizing wall-clock overhead through staged screening and warm-started variational parameters.
  • Figure 2: Sensitivity of final hypervolume to inducing point count $M_{\mathrm{ind}}$ on DTLZ2-100D. Error bars indicate standard error across five random seeds.
  • Figure 3: Convergence curves on Type A problems.
  • Figure 4: Convergence curves on 100D bi-objective problems. The results indicate faster convergence and higher final hypervolume across different problem instances.
  • Figure 5: Geothermal optimization results: (a) approximated Pareto fronts; (b) hypervolume progression over optimization iterations.
  • ...and 14 more figures
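The Figure 1 caption mentions Randomized Fourier Features as one of the complexity-reduction devices in the Deep GP pipeline. As a rough illustration of the general technique (not the paper's specific layer design), the sketch below approximates an RBF kernel with random cosine features. The function names, the lengthscale, and the feature count are assumptions chosen for the example.

```python
import numpy as np

def rff_features(X, n_features, lengthscale, rng):
    """Random cosine features whose inner products approximate an RBF kernel."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    d = X.shape[1]
    # Frequencies drawn from the RBF spectral density, N(0, 1 / lengthscale^2)
    W = rng.normal(scale=1.0 / lengthscale, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)   # random phases
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rbf_kernel(X, Y, lengthscale):
    """Exact kernel k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2))."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * sq / lengthscale**2)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
Z = rff_features(X, n_features=5000, lengthscale=1.5, rng=rng)
K_approx = Z @ Z.T                         # costs O(N * n_features) memory, not O(N^2) kernel solves
K_exact = rbf_kernel(X, X, lengthscale=1.5)
max_err = float(np.abs(K_approx - K_exact).max())
```

The approximation error shrinks roughly as one over the square root of the number of features, so the feature count trades accuracy against cost.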

Theorems & Definitions (14)

  • Proposition 15.1: Decomposed ECE bound for temperature-scaled MC-Dropout
  • Proposition 16.1: Finite-budget HV lower bound
  • Lemma 16.2: Per-step deficit decomposition
  • proof
  • Lemma 16.3: Telescoping of per-step deficits
  • proof
  • Lemma 16.4: Martingale increment bound and Azuma application
  • proof
  • Proposition 16.5: Pointwise MSE decomposition for Deep GP
  • Proposition 16.6: One-step HV loss
  • ...and 4 more