Global Safe Sequential Learning via Efficient Knowledge Transfer

Cen-You Li, Olaf Duennbier, Marc Toussaint, Barbara Rakitsch, Christoph Zimmer

TL;DR

This paper proposes safe transfer sequential learning, which accelerates task learning and expands the explorable safe region by leveraging abundant offline data from a related source task. It also provides a theoretical analysis explaining why single-task methods cannot cope with disconnected safe regions.

Abstract

Sequential learning methods, such as active learning and Bayesian optimization, aim to select the most informative data for task learning. In many applications, however, data selection is constrained by unknown safety conditions, motivating the development of safe learning approaches. A promising line of safe learning methods uses Gaussian processes to model safety conditions, restricting data selection to areas with high safety confidence. However, these methods are limited to local exploration around an initial seed dataset, as safety confidence centers around observed data points. As a consequence, task exploration is slowed down and safe regions disconnected from the initial seed dataset remain unexplored. In this paper, we propose safe transfer sequential learning to accelerate task learning and to expand the explorable safe region. By leveraging abundant offline data from a related source task, our approach guides exploration in the target task more effectively. We also provide a theoretical analysis to explain why single-task methods cannot cope with disconnected regions. Finally, we introduce a computationally efficient approximation of our method that reduces runtime through pre-computations. Our experiments demonstrate that this approach, compared to state-of-the-art methods, learns tasks with lower data consumption and enhances global exploration across multiple disjoint safe regions, while maintaining comparable computational efficiency.
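
The locality described in the abstract can be illustrated with a minimal single-task sketch. This is not the paper's method: the RBF kernel, the lengthscale, the confidence multiplier `beta`, the zero safety threshold, and the seed data are all assumptions chosen for illustration. A point counts as safe only when the GP lower confidence bound clears the threshold, so the safe set stays close to the seed data.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2, variance=1.0):
    # Squared-exponential kernel between two 1-D input arrays (assumed kernel choice).
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_std=0.1, lengthscale=0.2, variance=1.0):
    # Standard zero-mean GP regression posterior, computed via a Cholesky factorization.
    K = rbf_kernel(x_train, x_train, lengthscale, variance)
    K += noise_std**2 * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test, lengthscale, variance)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = variance - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def safe_mask(mean, std, beta=2.0, threshold=0.0):
    # A candidate is declared safe if its lower confidence bound is above the threshold.
    return mean - beta * std >= threshold

# Hypothetical seed data: three clearly safe observations near x = 0.
x_seed = np.array([-0.1, 0.0, 0.1])
z_seed = np.array([1.0, 1.0, 1.0])
x_grid = np.linspace(-2.0, 2.0, 401)

mean, std = gp_posterior(x_seed, z_seed, x_grid)
safe = safe_mask(mean, std)
print("safe candidates span:", x_grid[safe].min(), "to", x_grid[safe].max())
```

Far from the seed points the posterior reverts to the prior (mean near zero, standard deviation near the prior value), so the lower bound falls below the threshold and those candidates are excluded regardless of their true safety.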

Paper Structure

This paper contains 80 sections, 7 theorems, 34 equations, 12 figures, 5 tables, 3 algorithms.

Key Result

Theorem 4.1

We are given $\bm{x}_{1:N} \subseteq \mathcal{X}$. For any safety constraint indexed by $j=1,...,J$, let $z_{1:N}^j \coloneqq (z_1^j, ..., z_N^j )$ be the observed noisy safety values and let $\|(z_1^j, ..., z_N^j)\|\leq \sqrt{N}$. The safety value $z^{j}=q^{j}(\bm{x}) + \epsilon_{q^{j}}$ satisfies

Figures (12)

  • Figure 1: Illustration: Safe sequential learning with transfer (top) and conventional (bottom) learning. The light yellow data points represent source data. The main benefit of transfer learning is to accelerate exploration and identify larger and potentially disjoint safe regions by leveraging the source data.
  • Figure 2: A safety function (shown in black) with two safe regions above the threshold zero. Left: A GP surrogate is trained on the initial data, which lie within one of the safe regions. The blue line represents the mean prediction, while the blue shaded area indicates the uncertainty (e.g., confidence interval) around the mean. The green area indicates the learned safe area. Right: After exploration, more points are sampled within the first safe region. However, the gap to the second safe region exceeds $r$, which prevents the discovery of the second region and leaves the learned safe area almost unchanged. The true safety function used here is $q(x)= \sin\left( 10 x^3 - 5x - 10 \right) + \frac{1}{3} x^2 - \frac{1}{2}$. The observations carry noise drawn from $\mathcal{N}(0, 0.1^2)$.
  • Figure 3: The same safety constraint as in Figure 2, with two safe regions. Left: The single-task GP cannot reach the right safe region because the distance is greater than the radius $r$. Right: The multitask GP is able to exploit knowledge from the source data and build high safety confidence on the right region. The source data comes from the function $q_s(x)=\sin\left( 10 x^3 - 5x - 10 \right) + \sin( x^2 ) - \frac{1}{2}$ and is shown in yellow.
  • Figure 4: Empirical performance across all six benchmark datasets: RMSE to assess model convergence, TP rate to measure the coverage of the safe space explored, and FP rate to evaluate the safety of each approach. Both TP and FP rates are computed relative to the area of $\mathcal{X}_{\text{pool}}$. The ground-truth safe area portion for each dataset is indicated by a black line in the second column. Our approach generally shows improved convergence in terms of model performance and the extent of explored safe regions, while maintaining safety levels comparable to the baseline SAL. On GEngine, we additionally provide a zoomed-in RMSE figure (Figure 5).
  • Figure 5: The zoomed-in RMSE plot for GEngine from Figure 4.
  • ...and 7 more figures
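
The target and source safety functions given in the captions of Figures 2 and 3 can be evaluated directly. The sketch below is a minimal illustration: the input domain `[-1, 1]`, the grid resolution, the random seed, and the helper `safe_intervals` are assumptions, since the captions do not state the plotting range. It lists the intervals where each function is at or above the zero threshold and reports how strongly the two functions correlate on the assumed grid.

```python
import numpy as np

def q_target(x):
    # Target safety function from the Figure 2 caption.
    return np.sin(10 * x**3 - 5 * x - 10) + x**2 / 3 - 0.5

def q_source(x):
    # Source safety function from the Figure 3 caption.
    return np.sin(10 * x**3 - 5 * x - 10) + np.sin(x**2) - 0.5

def safe_intervals(xs, values, threshold=0.0):
    # Hypothetical helper: group consecutive grid points with value >= threshold
    # into (start, end) intervals.
    intervals = []
    start = None
    prev = None
    for x, v in zip(xs, values):
        if v >= threshold and start is None:
            start = x
        elif v < threshold and start is not None:
            intervals.append((float(start), float(prev)))
            start = None
        prev = x
    if start is not None:
        intervals.append((float(start), float(xs[-1])))
    return intervals

# Assumed domain and resolution; the captions do not specify them.
xs = np.linspace(-1.0, 1.0, 2001)

# Noisy observations with the noise level from the Figure 2 caption, N(0, 0.1^2).
rng = np.random.default_rng(0)
z_noisy = q_target(xs) + rng.normal(0.0, 0.1, size=xs.shape)

print("target safe intervals:", safe_intervals(xs, q_target(xs)))
print("source safe intervals:", safe_intervals(xs, q_source(xs)))
print("target/source correlation:", np.corrcoef(q_target(xs), q_source(xs))[0, 1])
```

Both functions share the same sinusoidal term and differ only in the additive term ($\frac{1}{3}x^2$ versus $\sin(x^2)$), which is why source observations are informative about where the target is safe.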

Theorems & Definitions (19)

  • Remark 3.3
  • Definition 4.0
  • Theorem 4.1: Local exploration of single-output GPs
  • Corollary 4.1: Existence of $\delta$
  • Example 4.2
  • Definition B.0
  • Definition C.1
  • Definition C.2
  • Proposition C.3
  • Proof: Proof of Proposition C.3
  • ...and 9 more