TCAFF: Temporal Consistency for Robot Frame Alignment

Mason B. Peterson, Parker C. Lusk, Antonio Avila, Jonathan P. How

TL;DR

The paper addresses frame alignment between neighboring robots in GPS-denied environments by introducing TCAFF, a temporal-consistency-driven, multi-hypothesis framework that operates on sparse open-set object maps to estimate and refine the relative transform between odometry frames. It combines an enhanced open-set data association method (MNO-CLIPPER) with a MAP-based frame-alignment filter that uses Kalman updates to maintain multiple hypotheses, incorporate new measurements over time, and reject temporally inconsistent candidates, even without an initial pose guess. The key contributions are the temporal-consistency mechanism for rejecting incorrect frame alignments, the multi-hypothesis frame-alignment filter, hardware demonstrations with four robots tracking six pedestrians that achieve tracking accuracy similar to that of a system with ground truth localization, and the release of the code and hardware dataset. The approach enables collaborative object tracking without global localization while maintaining high tracking accuracy.

Abstract

In the field of collaborative robotics, the ability to communicate spatial information like planned trajectories and shared environment information is crucial. When no global position information is available (e.g., indoor or GPS-denied environments), agents must align their coordinate frames before shared spatial information can be properly expressed and interpreted. Coordinate frame alignment is particularly difficult when robots have no initial alignment and are affected by odometry drift. To this end, we develop a novel multiple hypothesis algorithm, called TCAFF, for aligning the coordinate frames of neighboring robots. TCAFF considers potential alignments from associating sparse open-set object maps and leverages temporal consistency to determine an initial alignment and correct for drift, all without any initial knowledge of neighboring robot poses. We demonstrate TCAFF being used for frame alignment in a collaborative object tracking application on a team of four robots tracking six pedestrians and show that TCAFF enables robots to achieve a tracking accuracy similar to that of a system with ground truth localization. The code and hardware dataset are available at https://github.com/mit-acl/tcaff.
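
To make concrete what an aligned frame provides, the following is a minimal sketch, assuming a planar alignment parameterized as (x, y, θ), the same form as the frame alignment measurements plotted in Figure 3. The function name `transform_point` and the example numbers are hypothetical and are not taken from the released code.

```python
import math


def transform_point(alignment, point):
    """Express a 2D point from a neighbor's frame in the robot's own frame.

    alignment: (x, y, theta), the assumed pose of the neighbor's frame
               origin expressed in the robot's own frame.
    point:     (px, py) in the neighbor's frame.
    """
    x, y, theta = alignment
    px, py = point
    c, s = math.cos(theta), math.sin(theta)
    # Rotate the point by theta, then translate by (x, y).
    return (c * px - s * py + x, s * px + c * py + y)


# Hypothetical example: the neighbor's frame is offset by (1, 2) and
# rotated 90 degrees relative to the robot's own frame.
own_frame_point = transform_point((1.0, 2.0, math.pi / 2), (1.0, 0.0))
```

With a wrong or drifting alignment, every shared object position would be displaced by the same rigid error, which is why the alignment must be estimated before shared detections can be fused.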

Paper Structure (13 sections, 5 equations, 6 figures, 2 tables, 1 algorithm)


Figures (6)

  • Figure 1: To perform reliable frame alignment, pairs of robots use RGBD camera input to create sparse, open-set object maps (top left). Likely potential associations are then computed and used to determine a set of possible frame alignment rotations and translations (right). Finally, TCAFF considers multiple alignment hypotheses and identifies the current frame alignment with the greatest temporal consistency (bottom left).
  • Figure 2: Visualization of TCAFF multiple hypothesis process. (a) A new set of measurements is computed. (b) Leaf nodes are extended by applying Kalman Filter updates with candidate measurements. (c) Window is slid forward and unlikely branches are pruned.
  • Figure 3: TCAFF is visualized by plotting the frame alignment measurements from MNO-CLIPPER in blue along with the ground truth and TCAFF frame alignment estimate. Each blue dot represents a frame alignment measurement $\mathbf{z}(k) = [x, y, \theta]^\top$, with the $x$, $y$, and $\theta$ axes shown in separate plots. TCAFF correctly recognizes when enough temporally consistent measurements are received to verify a correct frame alignment. The ground truth frame alignment disappears in the middle of the run when one robot leaves the VICON room and its ground truth pose is unavailable. While in separate areas, map information is still exchanged, and MNO-CLIPPER can be used to find alignments between the two maps, but TCAFF rejects these temporally inconsistent measurements.
  • Figure 4: Four robots tracking six pedestrians in our motion capture space (left) with visualization of the localization and tracking estimates (right). Further qualitative results are shown in the supplementary video material.
  • Figure 5: Comparison of MOTA results. Results for using a single CLIPPER solution that requires a set minimum number of associations are shown for a sweep of different parameter values. TCAFF is able to consider all potential alignments and reject incorrect frame alignments by leveraging temporal consistency, resulting in a higher object tracking accuracy.
  • ...and 1 more figure
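
The captions for Figures 2 and 3 describe extending a hypothesis with a Kalman Filter update and rejecting temporally inconsistent measurements. A minimal sketch of one such step follows. It is not the authors' implementation: it assumes a static alignment state [x, y, θ], an identity measurement model, diagonal covariances, and a hypothetical chi-square-style gate of 9.0. The names `kalman_update`, `wrap_angle`, and `gate` are illustrative.

```python
import math


def wrap_angle(a):
    """Wrap an angle to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def kalman_update(state, var, z, meas_var, gate=9.0):
    """Update one alignment hypothesis with one measurement z = [x, y, theta].

    state, var:  current estimate and its per-axis variances (assumed diagonal).
    z, meas_var: measurement and its per-axis variances (assumed diagonal).
    gate:        hypothetical threshold on the squared Mahalanobis distance.

    Returns (new_state, new_var, accepted).
    """
    # Innovation: measurement minus prediction, with the angle wrapped.
    innov = [z[0] - state[0], z[1] - state[1], wrap_angle(z[2] - state[2])]
    # Innovation variance per axis.
    s = [v + r for v, r in zip(var, meas_var)]
    # Squared Mahalanobis distance; large values mean the measurement is
    # inconsistent with what this hypothesis has seen so far.
    d2 = sum(e * e / si for e, si in zip(innov, s))
    if d2 > gate:
        return list(state), list(var), False
    # Standard Kalman gain and update, applied per axis.
    gains = [v / si for v, si in zip(var, s)]
    new_state = [x + k * e for x, k, e in zip(state, gains, innov)]
    new_state[2] = wrap_angle(new_state[2])
    new_var = [(1.0 - k) * v for k, v in zip(gains, var)]
    return new_state, new_var, True


# Hypothetical usage: a nearby measurement is fused, a distant one is gated out.
est, est_var, ok = kalman_update([0.0, 0.0, 0.0], [1.0, 1.0, 1.0],
                                 [0.2, -0.2, 0.1], [1.0, 1.0, 1.0])
_, _, ok_outlier = kalman_update([0.0, 0.0, 0.0], [1.0, 1.0, 1.0],
                                 [10.0, 10.0, 3.0], [1.0, 1.0, 1.0])
```

In a multiple hypothesis setting, a step like this would be applied to each leaf node for each candidate measurement, with unlikely branches pruned as the window slides forward; that tree bookkeeping is omitted here.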