Efficient Robot Learning for Perception and Mapping

Niclas Vödisch

TL;DR

Problem: robust robotic perception and mapping in unseen environments with minimal human labeling. Approach: integrate continual learning, label-efficient segmentation using a frozen DINOv2 backbone, and multi-agent panoptic mapping, along with automatic target-less camera–LiDAR calibration and domain-transfer strategies between a source domain $\mathcal{S}$ and a target domain $\mathcal{T}$. Key contributions: continual SLAM with replay buffers and diversity-driven sampling, cross-domain mixing for panoptic adaptation, ten-shot segmentation with pseudo-labeling, automatic camera–LiDAR calibration, and a centralized hierarchical scene graph for cross-agent fusion. Significance: reduces the annotation burden and enables open-world, scalable perception and mapping systems suitable for real-world deployment.

Abstract

Holistic scene understanding is fundamental to the autonomous operation of a robotic agent in its environment. Key ingredients include a well-defined representation of the surroundings that captures their spatial structure, assigns semantic meaning, and delineates individual objects. Classic components from the roboticist's toolbox for addressing these tasks are simultaneous localization and mapping (SLAM) and panoptic segmentation. Although recent methods demonstrate impressive advances, mostly owing to deep learning, they commonly rely on in-domain training on large datasets. Since this paradigm substantially limits their real-world applicability, my research investigates how to minimize the human effort required to deploy perception-based robotic systems in previously unseen environments. In particular, I focus on leveraging continual learning and reducing human annotations for efficient learning. An overview of my work can be found at https://vniclas.github.io.

Paper Structure (3 sections, 1 figure)

This paper contains 3 sections, 1 figure.
