Table of Contents
Fetching ...

Perception with Guarantees: Certified Pose Estimation via Reachability Analysis

Tobias Ladner, Yasser Shoukry, Matthias Althoff

TL;DR

This work tackles the challenge of obtaining formally guaranteed 3D pose estimates from vision data in safety-critical settings by combining reachability-based image enclosures with formal neural-network verification. By modeling uncertainties with matrix polynomial zonotopes and leveraging a preprocessing-online refinement pipeline, the method yields a certified pose set $\mathcal{E}^*$ that contains the true pose $\xi^*$ given an image $I^*$ and a known target geometry $\mathcal{T}$. The key contributions are (i) sound image enclosure from uncertain poses, (ii) a two-phase image-to-pose procedure with offline candidate precomputation and online refinement via witness pixels and preimage enclosures, and (iii) extensive experiments on synthetic and real data demonstrating tight certificates and practical runtime. The approach advances safety guarantees for vision-based perception in CPS, enabling certified localization even when external sensors or services are unreliable, and lays groundwork for real-time implementations with further optimization and hardware acceleration.

Abstract

Agents in cyber-physical systems are increasingly entrusted with safety-critical tasks. Ensuring safety of these agents often requires localizing the pose for subsequent actions. Pose estimates can, e.g., be obtained from various combinations of lidar sensors, cameras, and external services such as GPS. Crucially, in safety-critical domains, a rough estimate is insufficient to formally determine safety, i.e., guaranteeing safety even in the worst-case scenario, and external services might additionally not be trustworthy. We address this problem by presenting a certified pose estimation in 3D solely from a camera image and a well-known target geometry. This is realized by formally bounding the pose, which is computed by leveraging recent results from reachability analysis and formal neural network verification. Our experiments demonstrate that our approach efficiently and accurately localizes agents in both synthetic and real-world experiments.

Perception with Guarantees: Certified Pose Estimation via Reachability Analysis

TL;DR

This work tackles the challenge of obtaining formally guaranteed 3D pose estimates from vision data in safety-critical settings by combining reachability-based image enclosures with formal neural-network verification. By modeling uncertainties with matrix polynomial zonotopes and leveraging a preprocessing-online refinement pipeline, the method yields a certified pose set that contains the true pose given an image and a known target geometry . The key contributions are (i) sound image enclosure from uncertain poses, (ii) a two-phase image-to-pose procedure with offline candidate precomputation and online refinement via witness pixels and preimage enclosures, and (iii) extensive experiments on synthetic and real data demonstrating tight certificates and practical runtime. The approach advances safety guarantees for vision-based perception in CPS, enabling certified localization even when external sensors or services are unreliable, and lays groundwork for real-time implementations with further optimization and hardware acceleration.

Abstract

Agents in cyber-physical systems are increasingly entrusted with safety-critical tasks. Ensuring safety of these agents often requires localizing the pose for subsequent actions. Pose estimates can, e.g., be obtained from various combinations of lidar sensors, cameras, and external services such as GPS. Crucially, in safety-critical domains, a rough estimate is insufficient to formally determine safety, i.e., guaranteeing safety even in the worst-case scenario, and external services might additionally not be trustworthy. We address this problem by presenting a certified pose estimation in 3D solely from a camera image and a well-known target geometry. This is realized by formally bounding the pose, which is computed by leveraging recent results from reachability analysis and formal neural network verification. Our experiments demonstrate that our approach efficiently and accurately localizes agents in both synthetic and real-world experiments.
Paper Structure (29 sections, 7 theorems, 42 equations, 15 figures, 3 tables, 3 algorithms)

This paper contains 29 sections, 7 theorems, 42 equations, 15 figures, 3 tables, 3 algorithms.

Key Result

Proposition 1

[proposition]prop:image-enclosure Given a set $\mathcal{X}\subset\mathbb{R}^n$ and a function $f\colon\mathbb{R}^n\rightarrow\mathbb{R}^m$, computes an outer-approximative output set. We provide details on $\opEnclose{\cdot}{\cdot}$ in sec:pZ.

Figures (15)

  • Figure 1: (a) Runway markings as by ICAO standard icao2018Aerodrome. (b) Runway markings observed from a landing plane runwaywebsite.
  • Figure 2: Example transformation given a target $\mathcal{T}$ and a pose $\xi$. The goal of this work is the inverse: How to obtain a certified estimate of $\xi$ given the final image and $\mathcal{T}^\text{TCF}$?
  • Figure 3: Overview: (a) Given a concrete image $I^*$ ( ) and a pose candidate $\mathcal{E}_c$ ( ), we check whether $I^*$ is contained in the outer-approximative image $\widehat{I}_{\mathcal{E}_c}$ ( ). (b) If not, $\mathcal{E}_c$ is discarded as $\xi^*\not\in\mathcal{E}_c$ then. Otherwise, we exploit the relative position of the vertices in $I^*$ ( ) within the uncertain vertices ( ) of $\mathcal{E}_c$ -- which are a byproduct of computing $\widehat{I}_{\mathcal{E}_c}$ -- to (c) impose constraints ( ) on $\mathcal{E}_c$, which -- when computed for all vertices -- obtains us the certified pose estimate $\mathcal{E}^*$ ( ).
  • Figure 4: Computation of the outer-approximative image for a target with a single polygon ($\rho=1$) and an uncertain pose $\mathcal{E}$ with perturbed angles $\theta_x,\theta_y,\theta_z$ by $10$ degrees: (a) Uncertain vertices $\mathcal{V}_{1}^\text{PCF}$ enclose the vertices of the random samples. (b) Convex hull computation over $\mathcal{V}_{1}^\text{PCF}$. (c) Outer-approximative image $\widehat{I}_\mathcal{E}$.
  • Figure 5: Certified pose estimation (\ref{['alg:image-to-pose']}): (a) Given an image $I^*$ and a (simple) target $\mathcal{T}^\text{TCF}$ ($\rho=1$), (b) the vertices of the (unobservable) target $\mathcal{T}^\text{PCF}$ are contained in the witness pixels $\mathcal{Q}_{c,1,k}$, $k\in[\nu_1]$, enabling us to impose constraints $C_{c,1,k}\leq d_{c,1,k}$ on a pose candidate $\mathcal{E}_c\subseteq\Xi$ via the shared latent space to (c) obtain a tight certified estimate of $\xi^*\in\mathcal{E}_c|_{C_c\leq d_c}$.
  • ...and 10 more figures

Theorems & Definitions (18)

  • Definition 1: Polygon
  • Definition 2: Target Object
  • Definition 3: Pinhole Camera Model hartley2003multiple
  • Definition 4: Noise Budget
  • Definition 5: Matrix Polynomial Zonotope ladner2025formal
  • Proposition 1: Image Enclosure kochdumper2022open
  • Proposition 2: Pose to Image
  • proof
  • Lemma 1: Preimage Enclosure koller2025out
  • proof
  • ...and 8 more