One Filter to Deploy Them All: Robust Safety for Quadrupedal Navigation in Unknown Environments

Albert Lin, Shuang Peng, Somil Bansal

TL;DR

This work introduces the Observation-Conditioned Reachability (OCR) safety filter for quadrupedal navigation in unknown environments. An OCR Value Network (OCR-VN) predicts a safety value function from reduced-order states, disturbance bounds, and LiDAR observations, so a single safety layer can support diverse nominal controllers without retraining. Safety is enforced at run time through online disturbance estimation and a quadratic-program-based adaptive filter that minimally overrides the nominal policy. The approach is validated in simulation and in hardware experiments on a Unitree Go1, showing robust safety across multiple controllers and environments, with calibration-based guarantees on the learned safety function. Overall, OCR provides a policy- and environment-agnostic safety mechanism that adapts in real time to changes in dynamics and perception, enabling safer deployment of learning-based legged locomotion in the wild.

Abstract

As learning-based methods for legged robots rapidly grow in popularity, it is important that we can provide safety assurances efficiently across different controllers and environments. Existing works either rely on a priori knowledge of the environment and safety constraints to ensure system safety or provide assurances for a specific locomotion policy. To address these limitations, we propose an observation-conditioned reachability-based (OCR) safety-filter framework. Our key idea is to use an OCR value network (OCR-VN) that predicts the optimal control-theoretic safety value function for new failure regions and dynamic uncertainty at deployment time. Specifically, the OCR-VN facilitates rapid safety adaptation through two key components: a LiDAR-based input that allows the dynamic construction of safe regions in light of new obstacles, and a disturbance estimation module that accounts for dynamics uncertainty in the wild. The predicted safety value function is used to construct an adaptive safety filter that overrides the nominal quadruped controller when necessary to maintain safety. Through simulation studies and hardware experiments on a Unitree Go1 quadruped, we demonstrate that the proposed framework can automatically safeguard a wide range of hierarchical quadruped controllers, adapt to novel environments, and remain robust to unmodeled dynamics without a priori access to the controllers or environments; hence, "One Filter to Deploy Them All". The experiment videos can be found on the project website.
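To make the idea of a filter that "overrides the nominal controller when necessary" concrete, here is a minimal sketch of a minimally invasive safety filter. It is not the paper's implementation. It assumes the safety condition can be written as a single affine constraint on the control, `a · u + b >= 0`, where `a` and `b` are hypothetical quantities that would be derived from the predicted value function and its gradients, and it ignores control bounds. Under those assumptions the quadratic program `min ||u - u_nom||^2` has a closed-form solution, so no solver is needed.

```python
from typing import List, Sequence


def filter_control(u_nom: Sequence[float], a: Sequence[float], b: float) -> List[float]:
    """Return the control closest to u_nom that satisfies a . u + b >= 0.

    Assumption (not from the source): safety reduces to one affine
    constraint on u, and there are no actuator limits.
    """
    # Constraint margin at the nominal control.
    margin = sum(ai * ui for ai, ui in zip(a, u_nom)) + b
    if margin >= 0.0:
        # Nominal control already satisfies the constraint: do not intervene.
        return list(u_nom)

    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        # a = 0 and b < 0: no control can satisfy the constraint.
        raise ValueError("constraint is infeasible")

    # Euclidean projection of u_nom onto the half-space {u : a . u + b >= 0}.
    scale = -margin / norm_sq
    return [ui + scale * ai for ui, ai in zip(u_nom, a)]


# A nominal command that violates the constraint is shifted just enough
# to land on the constraint boundary.
print(filter_control([1.0, 0.0], a=[-1.0, 0.0], b=0.5))
# A nominal command that already satisfies it is returned unchanged.
print(filter_control([0.2, 0.1], a=[-1.0, 0.0], b=0.5))
```

With actuator limits or several constraints, the projection no longer has this closed form and a numerical QP solver would be needed instead.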

Paper Structure

This paper contains 50 sections, 1 theorem, 22 equations, 7 figures, 7 tables.

Key Result

Theorem 1

Compute the number of "outliers" $k$ as: where $\beta$, $\epsilon$, and $N$ are as defined above. Compute the calibration level $\delta$ as the $\frac{N-k}{N}$ quantile of the conformal scores $\{s_i\}_{i=1}^N$. Then, with probability at least $1-\beta$ over the draws of the calibration samples, the following holds:

Figures (7)

  • Figure 2: The OCR framework. (Left) During training, we generate environments with random obstacles and disturbance bounds. The OCR-VN is trained to predict the value function (visualized over a grid) using the disturbance bound, the LiDAR reading, and the state. (Right) During deployment, the OCR-VN is queried with the observed LiDAR reading, the disturbance bound estimated using the most recent state and action history, and the current state estimate to construct an adaptive safety filter.
  • Figure 3: (Left) A LiDAR observation $o^e$ in a validation environment where $\bar{d}^e_{p_{x},p_{y}}=0.82$ m/s, $\bar{d}^e_{p_\theta}=0.56$ rad/s. (Right top-row) The ground-truth value function and its spatial gradients. (Right bottom-row) OCR-VN predictions using $o^e$ and $\bar{d}^e_r$. As shown above, the OCR-VN predictions for the value function and its spatial gradients are highly accurate.
  • Figure 4: The OCR framework (green/red : nominal/filtered) safeguards different nominal controllers navigating to a goal (cyan) in an environment with a payload of $-0.6$ kg and a friction of $0.7$. By themselves (white), the (a) PS + WTW, (b) NVE + WTW, and (c) ABS-Agile controllers fail to maintain safety due to dynamical uncertainty caused by low payload and friction.
  • Figure 5: (In color) Safety rates across settings (1,000 trials).
  • Figure 6: OCR (green/red : nominal/filtered) and ABS (white/black : nominal/filtered) frameworks in (a) a validation environment with a payload of $-0.9$ kg and a friction of $0.5$ and (b) a hand-designed obstacle configuration with a region of low friction (blue). In (b), we plot the evolution of the estimated disturbance bound in position $\bar{d}^e_{p_x,p_y}$, as well as the OCR-VN predictions at two different states (orange, purple).
  • ...and 2 more figures

Theorems & Definitions (3)

  • Theorem 1: Conformal OCR-VN Calibration
  • Remark
  • proof
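
The quantile step in Theorem 1 can be sketched as follows. This is an illustrative sketch only. The outlier count `k` is taken as an input because its formula is not reproduced in this summary, and the function name `calibration_level` is hypothetical. The sketch also assumes the common empirical-quantile convention in which the $\frac{N-k}{N}$ quantile of $N$ scores is the $(N-k)$-th smallest score; the paper may use a different convention.

```python
from typing import Sequence


def calibration_level(scores: Sequence[float], k: int) -> float:
    """Return the (N - k)/N empirical quantile of the conformal scores.

    Assumption (not from the source): the quantile is taken as the
    (N - k)-th smallest score, i.e. an order statistic with no interpolation.
    """
    n = len(scores)
    if not 0 <= k < n:
        raise ValueError("k must satisfy 0 <= k < N")
    ordered = sorted(scores)
    # Index n - k - 1 is the (N - k)-th smallest value (0-based indexing).
    return ordered[n - k - 1]


# Toy scores, for illustration only (not data from the paper).
scores = [0.3, 0.1, 0.5, 0.2, 0.4]
print(calibration_level(scores, k=1))  # 4th smallest of 5 scores
print(calibration_level(scores, k=0))  # largest score
```

Allowing more outliers (larger `k`) selects a smaller order statistic, so the resulting calibration level $\delta$ can only decrease or stay the same.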