Test-Time Certifiable Self-Supervision to Bridge the Sim2Real Gap in Event-Based Satellite Pose Estimation

Mohsi Jawaid, Rajat Talak, Yasir Latif, Luca Carlone, Tat-Jun Chin

TL;DR

A test-time self-supervision scheme with a certifier module that improves the pose estimates and closes the Sim2Real gap in event-based satellite pose estimation, and outperforms established test-time adaptation schemes.

Abstract

Deep learning plays a critical role in vision-based satellite pose estimation. However, the scarcity of real data from the space environment means that deep models need to be trained using synthetic data, which raises the Sim2Real domain gap problem. A major cause of the Sim2Real gap is the novel lighting conditions encountered at test time. Event sensors have been shown to provide some robustness against lighting variations in vision-based pose estimation. However, challenging lighting conditions due to strong directional light can still cause undesirable effects in the output of commercial off-the-shelf event sensors, such as noisy or spurious events and inhomogeneous event densities on the object. Such effects are non-trivial to simulate in software, thus leading to a Sim2Real gap in the event domain. To close the Sim2Real gap in event-based satellite pose estimation, the paper proposes a test-time self-supervision scheme with a certifier module. Self-supervision is enabled by an optimisation routine that aligns a dense point cloud of the satellite, placed at the predicted pose, with the event data in an attempt to rectify an inaccurately estimated pose. The certifier attempts to verify the corrected pose, and only certified test-time inputs are backpropagated via implicit differentiation to refine the predicted landmarks, thus improving the pose estimates and closing the Sim2Real gap. Results show that our method outperforms established test-time adaptation schemes.

Paper Structure (35 sections, 11 equations, 7 figures, 5 tables, 1 algorithm)

Figures (7)

  • Figure 1: RGB frames (top) and corresponding event frames (bottom) of a textureless 3D-printed satellite-like object (see Sec. \ref{sec:customsat} for the justification for using a textureless object) under different lighting conditions. The first column is synthetic, while the others are real. Observe that the real event frames under neutral and low lighting are similar to the synthetic event frame, despite visible lighting variations in RGB. However, under harsh directional lighting, the event sensor generated numerous events distributed non-uniformly on the object. See also Sec. \ref{sec:datasets} on sensor tuning.
  • Figure 2: (a) Real event frame captured under harsh lighting with the object detection bounding-box (blue) and projected wireframe (green) of the CAD model before test-time certifiable self-supervision. (b) Same event frame as (a) but with the projected wireframe after the test-time self-supervision.
  • Figure 3: Our test-time certifiable self-supervision scheme for satellite pose estimation from event frames.
  • Figure 4: Alignment of $\mathbf{E}_{2D}$ (black) and $\mathbf{C}_{2D}$ (green) for a non-certified instance (left) and a certified instance (right). The Hausdorff distance $\mathcal{H}(\mathbf{E}_{2D},\mathbf{C}_{2D})$ is larger for the alignment on the left than for the one on the right.
  • Figure 5: Zoomed-in view demonstrating the difference between RGB frames captured with the motion-priority and exposure-priority settings.
  • ...and 2 more figures
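
The caption of Figure 4 relates certification to the Hausdorff distance $\mathcal{H}(\mathbf{E}_{2D},\mathbf{C}_{2D})$ between the 2D event points and the projected point cloud. The following is a minimal sketch of how such a check could look, not the paper's implementation. It assumes the symmetric form of the Hausdorff distance and a fixed acceptance threshold. The function names, the threshold value, and the toy points are all hypothetical, since the text here specifies none of them.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest neighbour in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty 2D point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def certify(events_2d, cloud_2d, threshold):
    """Hypothetical certificate: accept the pose if the two sets are close.

    ASSUMPTION: a simple threshold on the symmetric Hausdorff distance.
    The actual certification rule is not given in this text.
    """
    return hausdorff(events_2d, cloud_2d) <= threshold


# Toy data, made up for illustration only.
events_2d = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
well_aligned = [(0.1, 0.0), (1.0, 0.1), (0.9, 1.0), (0.0, 0.9)]
misaligned = [(x + 3.0, y) for x, y in events_2d]  # shifted 3 units in x

print(certify(events_2d, well_aligned, threshold=0.5))  # small distance
print(certify(events_2d, misaligned, threshold=0.5))    # large distance
```

The brute-force nearest-neighbour search is quadratic in the number of points, which is fine for a sketch. For dense event frames, a spatial index such as a k-d tree would be the usual replacement.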