Enhancing Safety and Robustness of Vision-Based Controllers via Reachability Analysis

Kaustav Chakraborty, Aryaman Gupta, Somil Bansal

TL;DR

This work computes Neural Reachable Tubes, which act as parameterized approximations of Backward Reachable Tubes, to stress-test vision-based controllers and mine their failure modes, and validates the proposed approaches on an autonomous aircraft taxiing task.

Abstract

Autonomous systems, such as self-driving cars and drones, have made significant strides in recent years by leveraging visual inputs and machine learning for decision-making and control. Despite their impressive performance, these vision-based controllers can make erroneous predictions when faced with novel or out-of-distribution inputs. Such errors can cascade into catastrophic system failures and compromise system safety. In this work, we compute Neural Reachable Tubes, which act as parameterized approximations of Backward Reachable Tubes to stress-test the vision-based controllers and mine their failure modes. The identified failures are then used to enhance the system safety through both offline and online methods. The online approach involves training a classifier as a run-time failure monitor to detect closed-loop, system-level failures, subsequently triggering a fallback controller that robustly handles these detected failures to preserve system safety. For the offline approach, we improve the original controller via incremental training using a carefully augmented failure dataset, resulting in a more robust controller that is resistant to the known failure modes. In either approach, the system is safeguarded against shortcomings that transcend the vision-based controller and pertain to the closed-loop safety of the overall system. We validate the proposed approaches on an autonomous aircraft taxiing task that involves using a vision-based controller to guide the aircraft towards the centerline of the runway. Our results show the efficacy of the proposed algorithms in identifying and handling system-level failures, outperforming methods that rely on controller prediction error or uncertainty quantification for identifying system failures.
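The online approach described above can be pictured as a small switching rule: a learned monitor scores each camera image, and the fallback controller takes over whenever that score crosses a threshold. The following is a minimal sketch under stated assumptions, not the authors' implementation. The names `guarded_control`, `nominal`, `fallback`, `failure_score`, and `threshold` are hypothetical, and the convention that a higher score means a more likely failure is assumed.

```python
from typing import Any, Callable, Tuple


def guarded_control(
    image: Any,
    nominal: Callable[[Any], float],
    fallback: Callable[[Any], float],
    failure_score: Callable[[Any], float],
    threshold: float,
) -> Tuple[float, bool]:
    """Pick a control input for one time step.

    Returns the control and a flag saying whether the monitor fired.
    Assumption: a higher failure_score means a more likely system failure.
    """
    flagged = failure_score(image) >= threshold
    # Hand over to the fallback controller only when the monitor fires.
    controller = fallback if flagged else nominal
    return controller(image), flagged


# Toy stand-ins: the "image" is a single number that doubles as its own score.
control, used_fallback = guarded_control(
    0.9,
    nominal=lambda img: 1.0,
    fallback=lambda img: 0.0,
    failure_score=lambda img: img,
    threshold=0.5,
)
```

In this toy call the score 0.9 exceeds the threshold 0.5, so the fallback's output is returned and the flag is set.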

Paper Structure

This paper contains 24 sections, 1 theorem, 22 equations, 8 figures, 1 table.

Key Result

Lemma 1

Conformalized Recall. Consider a conformalized classifier, $\hat{\sigma}(I)$, that satisfies Eq. (eqn:class_conf_pred) for some $\hat{q} \in [0,1]$ and a mapping $\sigma_{\phi}: I \rightarrow \mathbb{R}$. Then Recall($\hat{\sigma}$) $\geq 1-\alpha$, where,
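The referenced equation is not reproduced here, so the exact calibration rule used in the paper is not shown. As a generic illustration of how a recall bound of this form can arise, the sketch below applies standard split conformal calibration to the scores of held-out failure examples. It assumes that higher scores indicate failure, that an image is flagged when its score is at least the threshold, and that calibration and test failures are exchangeable. The function name `recall_threshold` is hypothetical.

```python
import math
from typing import Sequence


def recall_threshold(failure_scores: Sequence[float], alpha: float) -> float:
    """Threshold q so that flagging `score >= q` misses at most an alpha
    fraction of true failures (marginally, under exchangeability).

    failure_scores: monitor scores on held-out images labelled as failures.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    n = len(failure_scores)
    # Rank of the calibration score to use: the k-th smallest,
    # with k = floor(alpha * (n + 1)).
    k = math.floor(alpha * (n + 1))
    if k < 1:
        # Too few calibration points for this alpha: flag everything.
        return -math.inf
    k = min(k, n)
    return sorted(failure_scores)[k - 1]


# Seven calibration failures, alpha = 0.25 -> k = floor(0.25 * 8) = 2,
# so the threshold is the second-smallest score.
q_hat = recall_threshold([0.9, 0.1, 0.5, 0.7, 0.3, 0.8, 0.6], alpha=0.25)
```

A new failure score falls strictly below the k-th smallest calibration score with probability at most k/(n+1), which is at most alpha, so the flagged fraction of failures is at least 1 - alpha. Unlike the lemma's $\hat{q} \in [0,1]$, this sketch may return negative infinity in the degenerate case of too few calibration points.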

Figures (8)

  • Figure 1: (a) The overhead view of the XPlane simulator with the aircraft on the KMWH runway. $p_{x}$, $p_{y}$, $\theta$ denote the state of the aircraft; dashed white lines show the FoV of the camera. The aircraft is required to track the centerline and perform the taxiing task without leaving the runway. Runway simulation images for $d_2$: clear and $d_1$: (b) morning, (c) evening, and (d) night, and for $d_2$: overcast and $d_1$: (e) morning, (f) evening, and (g) night.
  • Figure 2: Closed-loop TaxiNet NRT slices over different values of $p_x$ and $\theta$ for a fixed initial $p_y=110m$, with variation of the parameters $d_{i}$ (top row) and a corresponding sample image (bottom row): (a) $d_1 =$ morning, $d_2 =$ clear, (b) $d_1 =$ evening, $d_2 =$ clear, (c) $d_1 =$ night, $d_2 =$ clear, and (d) $d_1 =$ morning, $d_2 =$ overcast. Visually similar downsampled images (bottom row inset) are shown for (a) $d_1 =$ morning, $d_2 =$ clear and (d) $d_1 =$ morning, $d_2 =$ overcast.
  • Figure 3: (a) The overlaid NRTs for $d_1=$ night (blue) on $d_1=$ morning (cyan) for $p_y = 110m$. The state of interest, shown with a yellow star, is only contained in the morning NRT and not in the night NRT. (b) Top view of the runway in the morning. The trajectory from "A" to "C", followed by the aircraft under the CNN policy (cyan line), takes it off the runway in the morning. The trajectory (blue line) from "A" to "B" is followed at night. (c) The runway marking in the image, which acts as a failure mode, can be seen vividly by the CNN at location "A" in the morning, but (d) cannot be seen clearly at night due to poor illumination. (e) The overlaid NRTs for $d_1=$ morning (cyan) on $d_1=$ night (blue) for $p_y = 190m$. The state, shown with a yellow star, is only included in the night NRT. (f) Top view of the runway. In the morning, the CNN policy accomplishes the taxiing task by taking the cyan trajectory from "A" (yellow star in (a)) to "C." At night, the policy takes the aircraft outside the runway along the blue trajectory from "A" to "B". (g) The centreline in the image can be seen clearly by the CNN at location "A" in the morning, whereas (h) it cannot be seen at night due to poor illumination.
  • Figure 4: (Left) Variation of $\hat{q}$ on changing the values of $\alpha$. (Right) ROC plot on unseen test environment.
  • Figure 5: Some of the failures detected by FD. (a, b) Images correspond to the aircraft being close to the runway boundaries (highlighted with the purple bounding boxes). (c, d) TaxiNet confuses the runway markings (highlighted with the blue bounding boxes) with the centerline, which ultimately leads to a system failure. (e, f) Image (f) is (accurately) not classified as a failure during the night time (the same image is classified as a failure during the day, shown in (e)), as the runway lights (highlighted with the yellow bounding boxes) help TaxiNet predict its position accurately and thereby avoid failure.
  • ...and 3 more figures

Theorems & Definitions (3)

  • Lemma
  • Remark
  • Remark