
High-dimensional Level Set Estimation with Trust Regions and Double Acquisition Functions

Giang Ngo, Dat Phan Trong, Dang Nguyen, Sunil Gupta

TL;DR

This work tackles high-dimensional level-set estimation for expensive black-box functions by introducing TRLSE, a multi-trust-region framework that jointly leverages a global acquisition function to locate the threshold boundary and local acquisition functions to refine it within regions. The method provides theoretical guarantees for classification accuracy outside the trust regions and demonstrates superior sample efficiency over baselines on synthetic and real-world problems up to 1000 dimensions. Empirically, TRLSE achieves competitive or superior F1-scores while maintaining reasonable runtimes, and ablation studies highlight the importance of boundary-centered TR updates, local GP modeling, and informed region reinitialization. The approach offers a scalable, theoretically grounded solution for HDLSE with practical impact on applications requiring accurate level-set delineation under costly evaluations.

Abstract

Level set estimation (LSE) classifies whether an unknown function's value exceeds a specified threshold for given inputs, a fundamental problem in many real-world applications. In active learning settings with limited initial data, we aim to iteratively acquire informative points to construct an accurate classifier for this task. In high-dimensional spaces, this becomes challenging because the search volume grows exponentially with the dimensionality. We propose TRLSE, an algorithm for high-dimensional LSE, which identifies and refines regions near the threshold boundary with dual acquisition functions operating at both global and local levels. We provide a theoretical analysis of TRLSE's accuracy and show its superior sample efficiency against existing methods through extensive evaluations on multiple synthetic and real-world LSE problems.
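As a rough illustration of the LSE task described above, the sketch below classifies candidate points from a surrogate model's posterior mean and standard deviation using confidence bounds, and picks the next query with the well-known straddle heuristic. This is a generic confidence-bound rule, not TRLSE's global or local acquisition function; the function names, the input lists, and the value of `beta` are assumptions made for the example.

```python
from typing import List, Sequence


def classify_level_set(means: Sequence[float], stds: Sequence[float],
                       h: float, beta: float) -> List[str]:
    """Label each point as above/below the threshold h, or uncertain.

    A point is 'above' when its lower confidence bound exceeds h,
    'below' when its upper confidence bound is under h, and
    'uncertain' when the confidence interval still contains h.
    """
    labels = []
    for mu, sigma in zip(means, stds):
        lower = mu - beta * sigma
        upper = mu + beta * sigma
        if lower > h:
            labels.append("above")
        elif upper < h:
            labels.append("below")
        else:
            labels.append("uncertain")
    return labels


def straddle_next_query(means: Sequence[float], stds: Sequence[float],
                        h: float, beta: float) -> int:
    """Return the index maximizing the straddle score beta*sigma - |mu - h|.

    The score is large for points that are both uncertain and close to
    the threshold (hypothetical stand-in for an acquisition function).
    """
    scores = [beta * s - abs(m - h) for m, s in zip(means, stds)]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy posterior at three candidate points (made-up numbers).
means = [2.0, -1.0, 0.1]
stds = [0.2, 0.3, 0.5]
labels = classify_level_set(means, stds, h=0.0, beta=2.0)
next_idx = straddle_next_query(means, stds, h=0.0, beta=2.0)
print(labels, next_idx)
```

In an active-learning loop, the point at `next_idx` would be evaluated on the expensive function, the surrogate refit, and the classification repeated until the budget is spent.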

Paper Structure

This paper contains 48 sections, 6 theorems, 21 equations, 11 figures, 5 tables, 1 algorithm.

Key Result

Lemma 4.1

Let $\beta\geq\Phi^{-1}(\psi)$ with $\psi\in(0.5,1)$ and $S(u)=2/(1+\exp(au-b))$ with $b=\psi a$. Any region $\mathcal{T}^i$ for which either $\bar{l}_{t-1}^i\geq h$ or $\bar{u}_{t-1}^i\leq h$ holds, and continues to hold in subsequent iterations, will be replaced after at most $\zeta$ iterations with $\zeta=\l
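The two quantities defined at the start of the lemma can be evaluated numerically. The sketch below assumes that $\Phi^{-1}$ is the standard normal quantile function, which is the usual convention but is not stated in this excerpt. The values of `a` and `psi` are hypothetical, and the role of the argument $u$ is not given here, so the code only evaluates the formula as written.

```python
import math
from statistics import NormalDist


def S(u: float, a: float, psi: float) -> float:
    """Evaluate S(u) = 2 / (1 + exp(a*u - b)) with b = psi * a."""
    b = psi * a
    return 2.0 / (1.0 + math.exp(a * u - b))


def min_beta(psi: float) -> float:
    """Smallest beta allowed by the lemma, beta >= Phi^{-1}(psi).

    Assumes Phi is the standard normal CDF (not confirmed by the excerpt).
    """
    return NormalDist().inv_cdf(psi)


# Hypothetical parameter choices for illustration only.
a, psi = 10.0, 0.7
print(S(psi, a, psi))      # the exponent is zero at u = psi, so S(psi) = 1
print(S(0.0, a, psi))      # between 1 and 2 for u < psi when a > 0
print(S(1.0, a, psi))      # between 0 and 1 for u > psi when a > 0
print(min_beta(psi))       # positive, because psi > 0.5
```

With $b=\psi a$ and $a>0$, $S$ equals 1 exactly at $u=\psi$, exceeds 1 below it, and falls under 1 above it, so $\psi$ acts as the crossover point of the sigmoid.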

Figures (11)

  • Figure 1: An illustration of the non-negativity assumption on global and local acquisition values, shown on AA33D. The global acquisition values at the local points remain non-negative. Global values decrease steadily as more points are sampled, as shown in the proof of the main theorem, while local values often spike when new uncertain areas are explored.
  • Figure 2: Results on synthetic functions and real-world benchmark problems
  • Figure 3: Ablation studies varying $V_{init}$ and $R$
  • Figure 4: Ablation studies with: 1) random new TRs, and 2) only the global GP for computations.
  • Figure 5: Ablation studies for $S(\cdot)$: (a,b) No volume update + sigmoid variants, and (c,d) linear variants.
  • ...and 6 more figures

Theorems & Definitions (12)

  • Lemma 4.1
  • Definition 4.2
  • Theorem 4.3: Accuracy guarantee of TRLSE
  • Lemma 4.4
  • proof
  • Lemma A.2: Non-negativity of Acquisition Values
  • proof
  • Lemma A.3: Upper bound the average global acquisition value
  • proof
  • Lemma A.4: Linking Acquisition Value to Classification Accuracy
  • ...and 2 more