Active Learning for Level Set Estimation Using Randomized Straddle Algorithms

Yu Inatsu, Shion Takeno, Kentaro Kutsukake, Ichiro Takeuchi

TL;DR

This work tackles level set estimation for expensive black-box functions by adopting a Gaussian process surrogate and introducing a randomized straddle acquisition that uses a chi-squared sample for the confidence parameter, removing the need for iteration- or candidate-dependent tuning. Theoretical guarantees are established via the maximum information gain γ_t, yielding bounds on the cumulative loss R_t and per-iteration loss r_t. The approach extends to max-value loss settings over finite and infinite input spaces, with corresponding finite/discretized analyses. Empirical results on synthetic and real data demonstrate competitive performance, establishing practical viability alongside theoretical guarantees and highlighting the method’s applicability to continuous input domains.

Abstract

Level set estimation (LSE), the problem of identifying the set of input points where a function takes values above (or below) a given threshold, is important in practical applications. When the function is expensive to evaluate and black-box, the straddle algorithm, a representative heuristic for LSE based on Gaussian process models, and extensions of it with theoretical guarantees have been developed. However, many existing methods include a confidence parameter $\beta^{1/2}_t$ that must be specified by the user, and methods that choose $\beta^{1/2}_t$ heuristically do not provide theoretical guarantees. In contrast, theoretically guaranteed values of $\beta^{1/2}_t$ must grow with the number of iterations and candidate points, which makes them conservative and hurts practical performance. In this study, we propose a novel method, the randomized straddle algorithm, in which $\beta_t$ in the straddle algorithm is replaced by a random sample from the chi-squared distribution with two degrees of freedom. The confidence parameter in the proposed method needs no adjustment, does not depend on the number of iterations or candidate points, and is not conservative. Furthermore, we show that the proposed method has theoretical guarantees that depend on the sample complexity and the number of iterations. Finally, we confirm the usefulness of the proposed method through numerical experiments using synthetic and real data.
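To make the selection rule concrete, the following is a minimal sketch of one randomized straddle step, not the authors' implementation. It assumes the straddle score ${\rm STR}_{t-1}(x) = \beta^{1/2}_t \sigma_{t-1}(x) - |\mu_{t-1}(x) - \theta|$ quoted in the Figure 2 caption, and it uses the standard fact that a chi-squared variable with two degrees of freedom is exponential with mean 2. The function names and the posterior values `mu` and `sigma` are hypothetical; in practice they would come from a fitted Gaussian process, and the paper's acquisition function $a_{t-1}(x)$ may differ in detail from the raw score used here.

```python
import math
import random


def sample_beta(rng):
    """Draw beta_t from the chi-squared distribution with 2 degrees of freedom.

    Chi-squared(2) is the exponential distribution with mean 2 (rate 1/2),
    so the standard library sampler is sufficient.
    """
    return rng.expovariate(0.5)


def randomized_straddle_pick(mu, sigma, theta, rng):
    """Return (index of next point, sampled beta_t^{1/2}, all scores).

    mu, sigma: posterior means and standard deviations at the candidates
               (hypothetical inputs; a GP surrogate would supply them).
    theta:     level set threshold.
    """
    beta_sqrt = math.sqrt(sample_beta(rng))
    # Straddle score: large where the posterior is uncertain AND near theta.
    scores = [beta_sqrt * s - abs(m - theta) for m, s in zip(mu, sigma)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, beta_sqrt, scores


# Illustrative posterior over three candidate points (made-up numbers).
rng = random.Random(0)
mu = [0.0, 2.9, 5.0]
sigma = [0.5, 0.1, 0.5]
idx, beta_sqrt, scores = randomized_straddle_pick(mu, sigma, theta=3.0, rng=rng)
print(f"beta_t^(1/2) = {beta_sqrt:.3f}, next index = {idx}")
```

Because $\beta_t$ is redrawn at every iteration, no schedule in $t$ or in the number of candidates has to be tuned; the only inputs are the posterior summaries and the threshold.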

Paper Structure

This paper contains 28 sections, 10 theorems, 79 equations, 5 figures, 2 tables, 3 algorithms.

Key Result

Theorem 4.1

Assume that $f$ follows $\mathcal{GP}(0, k)$, where $k(\cdot, \cdot)$ is a positive-definite kernel satisfying $k({\bm x}, {\bm x}) \leq 1$ for any ${\bm x} \in \mathcal{X}$. For each $t \geq 1$, let $\beta_t$ be a sample from the chi-squared distribution with two degrees of freedom, where $C_1 = 4/\log(1 + \sigma^{-2}_{\rm noise})$, and the expectation is taken with all random

Figures (5)

  • Figure 1: Comparison of the confidence parameter $\beta^{1/2}_t$ in the randomized straddle and LSE algorithms. The left panel shows the histogram of $\beta^{1/2}_t$ when $\beta_t$ is sampled 1,000,000 times from the chi-squared distribution with two degrees of freedom. In the center and right panels, the red line denotes $\mathbb{E}[\beta^{1/2}_t] = \sqrt{2\pi}/2 \approx 1.25$, the shaded area denotes the $95\%$ confidence interval of $\beta^{1/2}_t$, and the black line denotes the theoretical value of $\beta^{1/2}_t$ in the LSE algorithm, given by $\beta^{1/2}_t = \sqrt{2 \log(|\mathcal{X}| \pi^2 t^2 / (6\delta))}$ with $\delta = 0.05$. The center panel shows how $\beta^{1/2}_t$ behaves as the number of iterations $t$ increases with the number of candidate points $|\mathcal{X}|$ fixed at 1000, whereas the right panel shows how $\beta^{1/2}_t$ behaves as $|\mathcal{X}|$ increases with $t$ fixed at 100.
  • Figure 2: Comparison of points selected by $a_{t-1}(x)$ with different $\beta^{1/2}_t$. The red line represents the true black-box function $f(x) = 5\exp(-(x+5)^2) + 5\exp(-(x-5)^2) - 2\exp(-x^2) - 1$, the black line represents the posterior mean, and the blue crosses represent the observed points. The left, center, and right panels show 20 observation points obtained when $a_{t-1}(x)$ is computed with, respectively, $\beta^{1/2}_t = 1$, $\beta^{1/2}_t = 10$, and $\beta_t$ drawn from the chi-squared distribution with two degrees of freedom, where $x = -5$ is the initial point, the observation noise is $\sigma^2_{\rm noise} = 10^{-2}$, and the threshold is $\theta = 3$. Recall that ${\rm STR}_{t-1}(x) = \beta^{1/2}_t \sigma_{t-1}(x) - |\mu_{t-1}(x) - \theta|$. When $\beta^{1/2}_t = 1$, the confidence parameter is small, so the second term of ${\rm STR}_{t-1}(x)$ dominates, and only points whose posterior mean is close to the threshold are observed. Conversely, when $\beta^{1/2}_t = 10$, the confidence parameter is large, so the first term of ${\rm STR}_{t-1}(x)$ dominates; the resulting AF is almost the same as uncertainty sampling, and the selected inputs are spaced almost equally apart. Neither behavior is observed with the proposed method.
  • Figure 3: Averages of the loss $r_t$ and ${\rm Fscore}_t$ for each AF over 100 simulations for each setting when the input space is finite. The top row shows $r_t$, and the bottom row shows ${\rm Fscore}_t$. Error bars represent six times the standard error.
  • Figure 4: Averages of the loss $r_t$ and ${\rm Fscore}_t$ for each AF over 100 simulations for each setting when the input space is infinite. The top row shows $r_t$, and the bottom row shows ${\rm Fscore}_t$. Error bars represent six times the standard error.
  • Figure 5: Averages of the loss $r_t$ and ${\rm Fscore}_t$ for each AF over 100 simulations using the carrier lifetime data. The left figure shows $r_t$, while the right figure shows ${\rm Fscore}_t$, with error bars representing six times the standard error.
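The comparison described in the Figure 1 caption can be reproduced numerically. The sketch below evaluates the two quantities quoted there: the mean $\mathbb{E}[\beta^{1/2}_t] = \sqrt{2\pi}/2$ of the randomized confidence parameter, and the LSE value $\sqrt{2 \log(|\mathcal{X}| \pi^2 t^2 / (6\delta))}$. It also checks the mean by Monte Carlo, using the standard fact that a chi-squared variable with two degrees of freedom is exponential with mean 2. The function name `lse_beta_sqrt` and the sample size are illustrative choices, not taken from the paper.

```python
import math
import random


def lse_beta_sqrt(n_candidates, t, delta):
    """Theoretical beta_t^{1/2} of the LSE algorithm, as quoted in Figure 1."""
    return math.sqrt(
        2.0 * math.log(n_candidates * math.pi**2 * t**2 / (6.0 * delta))
    )


# Closed-form mean of beta_t^{1/2} when beta_t ~ chi-squared(2).
expected_beta_sqrt = math.sqrt(2.0 * math.pi) / 2.0

# Monte Carlo check: chi-squared(2) equals an exponential with rate 1/2.
rng = random.Random(0)
n_samples = 200_000  # illustrative sample size
mc_mean = sum(math.sqrt(rng.expovariate(0.5)) for _ in range(n_samples)) / n_samples

# LSE value at the settings used in the caption: |X| = 1000, t = 100, delta = 0.05.
lse_value = lse_beta_sqrt(n_candidates=1000, t=100, delta=0.05)

print(f"closed-form mean of beta^(1/2): {expected_beta_sqrt:.4f}")
print(f"Monte Carlo mean of beta^(1/2): {mc_mean:.4f}")
print(f"LSE beta^(1/2) at |X|=1000, t=100: {lse_value:.4f}")
```

The LSE value grows with both $t$ and $|\mathcal{X}|$, while the distribution of the randomized parameter stays fixed, which is the contrast the center and right panels of Figure 1 illustrate.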

Theorems & Definitions (23)

  • Definition 3.1: Level Set Estimation
  • Definition 3.2: Randomized Straddle
  • Theorem 4.1
  • Theorem 4.2
  • Theorem 4.3
  • Definition A.1: Randomized Straddle for Max-value Loss
  • Definition A.2
  • Theorem A.1
  • Theorem A.2
  • Definition A.3
  • ...and 13 more