Active Calibration of Reachable Sets Using Approximate Pick-to-Learn

Sampada Deglurkar, Ebonye Smith, Jingqi Li, Claire J. Tomlin

Abstract

Reachability computations that rely on learned or estimated models require calibration in order to maintain confidence in their guarantees. Calibration generally involves sampling scenarios inside the reachable set. However, producing reasonable probabilistic guarantees may require many samples, which can be costly. To remedy this, we propose that calibration of reachable sets be performed using active learning strategies. To obtain a probabilistic guarantee for the active learning procedure, we adapt the Pick-to-Learn algorithm, which provides generalization bounds for standard supervised learning, to the active learning setting. Our method, Approximate Pick-to-Learn, treats the process of choosing data samples as maximizing an approximate error function; we then use conformal prediction to ensure that the approximate error is close to the true model error. We demonstrate our technique on a simulated drone racing example in which learning is used to provide an initial guess of the reachable tube. Our method requires fewer samples to calibrate the model and provides more accurate sets than the baselines, while simultaneously yielding tight generalization bounds.
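
For intuition, the active calibration loop described above might look roughly like the following. This is a minimal sketch under stated assumptions: the helpers `approx_error` (a stand-in for the surrogate $a_{h,\eta}(x)$), `sample_true_error` (the costly true-error query), `model.update`, and `conformal_offset` are hypothetical names, not the paper's implementation.

```python
import numpy as np

def conformal_offset(model, approx_error, samples, alpha):
    """Split-conformal-style correction: the (1 - alpha) empirical quantile of
    (true error - surrogate error) over the collected samples, so that the
    corrected surrogate tends to over-estimate the true error."""
    if not samples:
        return 0.0
    residuals = [err - approx_error(model, x) for x, err in samples]
    return float(np.quantile(residuals, 1.0 - alpha))

def approximate_pick_to_learn(model, approx_error, sample_true_error,
                              candidates, alpha=0.1, tol=1e-3, max_iters=70):
    """Sketch of the loop: repeatedly query the candidate scenario where the
    calibrated surrogate error is largest, refine the model, and stop once even
    the worst surrogate error is small. The size |Q| of the picked set then
    enters a Pick-to-Learn-style generalization bound."""
    Q = []          # actively chosen (state, true error) pairs
    offset = 0.0    # conformal correction added to the surrogate error
    for _ in range(max_iters):
        if not candidates:
            break
        scores = [approx_error(model, x) + offset for x in candidates]
        i_star = int(np.argmax(scores))
        if scores[i_star] <= tol:        # no candidate with large surrogate error remains
            break
        x_star = candidates[i_star]
        err = sample_true_error(x_star)  # expensive: simulate / sample this scenario
        Q.append((x_star, err))
        candidates = candidates[:i_star] + candidates[i_star + 1:]  # do not re-query it
        model = model.update(x_star, err)  # refine the reachable-set estimate
        offset = conformal_offset(model, approx_error, Q, alpha)
    return model, Q
```

As described in the abstract, the conformal step is what lets the cheap surrogate stand in for the true model error when choosing where to sample next.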

Paper Structure

This paper contains 14 sections, 2 theorems, 3 equations, 3 figures, 1 table, and 1 algorithm.

Key Result

Theorem 1

Let $\mathcal{A}$ be Algorithm 1 (Approximate Pick-to-Learn), such that $\mathcal{A}(D) = (h, Q)$. Then, for any $\delta \in (0, 1)$ and $\alpha \in (0, 1)$, the resulting generalization guarantee holds with $\bar{\epsilon}(|Q|, \delta)$ calculated as in Pick-to-Learn.

Figures (3)

  • Figure 1: (a) Our framework provides a probabilistic generalization bound for the iterative process of learning and actively sampling the next data point. The novelty of our method lies in the realization that if we conformally calibrate our active sampling strategy so that it approximates maximizing the true model error, we can achieve a desired probabilistic guarantee. (b) We demonstrate our method on a drone racing example in which the ego drone (green) overtakes another drone (yellow). We display the reach-avoid set learned by our method, with lighter colors indicating the set at earlier iterations and darker colors the set at later iterations. The orange points are samples taken by our calibrated active learning strategy.
  • Figure 2: (a) In this 2-dimensional "Slice 1", the ego drone's 3D velocity is set to $[0.0, 0.7, 0.0]$ and its altitude is $0.0$. The other drone's 3D spatial coordinates are $[0.4, -2.2, 0.0]$ and its 3D velocity is $[0.0, 0.3, 0.0]$. (b) In this 2-dimensional "Slice 2", the other drone's state is the same, but the ego drone's velocity is $[0.0, 0.0, -0.5]$ and its altitude is $0.05$. The baselines' calibration technique is to simply choose an appropriate level of the learned value function. For the more commonly seen scenario (a), the level sets have more regular shapes, which better justifies this calibration technique. However, for the less commonly seen scenario (b), this is not the case, and it is more reasonable to be unconstrained by the learned set geometries, as in our method. In both panels, the orange points are the samples that our method took.
  • Figure 3: Plots comparing the number of samples and the false positive rate (FPR) / false negative rate (FNR) across methods; a sketch of how these rates can be computed follows this list. "Boundary" denotes our boundary sampling active learning technique, while "Random" refers to setting $a_{h,\eta}(x)$ to random values. In 3D experiments, the slice of the dynamics is the same as that for the 2D Slice 1, except that the ego drone's altitude is also allowed to vary. In 4D experiments, we additionally allow the ego agent's velocity in the x-direction to vary. The numbers above the bars indicate how many seeds out of 10 were successful for our method. Here, a seed counts as a failure if our algorithm runs for more than 70 iterations, since the method is then no longer adaptive with minimal samples. Comparisons are made only for successful seeds. Compared with the baselines, our method strikes a good balance among the competing objectives of minimizing the number of samples, the FPR, and the FNR. As expected, performance for all methods is worse for Slice 2 than for Slice 1.
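
To make the reported metrics concrete, here is a minimal sketch of how the FPR and FNR of a predicted reach-avoid set could be computed against ground-truth membership labels. The function and variable names (`set_classification_rates`, `value_fn`, `ground_truth_member`, `grid_states`, `level`) are illustrative assumptions, not identifiers from the paper.

```python
import numpy as np

def set_classification_rates(predicted_in_set, true_in_set):
    """FPR: fraction of states truly outside the set that the prediction includes.
    FNR: fraction of states truly inside the set that the prediction excludes."""
    predicted_in_set = np.asarray(predicted_in_set, dtype=bool)
    true_in_set = np.asarray(true_in_set, dtype=bool)
    fp = np.sum(predicted_in_set & ~true_in_set)
    fn = np.sum(~predicted_in_set & true_in_set)
    fpr = fp / max(np.sum(~true_in_set), 1)   # guard against division by zero
    fnr = fn / max(np.sum(true_in_set), 1)
    return float(fpr), float(fnr)

# Hypothetical usage: evaluate a learned set on a grid of sampled states.
# pred = [value_fn(x) <= level for x in grid_states]       # predicted membership
# true = [ground_truth_member(x) for x in grid_states]     # e.g., from exhaustive simulation
# fpr, fnr = set_classification_rates(pred, true)
```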

Theorems & Definitions (4)

  • Theorem 1
  • Proof of Theorem 1
  • Lemma 1
  • Proof of Lemma 1