V2VLoc: Robust GNSS-Free Collaborative Perception via LiDAR Localization

Wenkai Lin, Qiming Xia, Wen Li, Xun Huang, Chenglu Wen

TL;DR

This work tackles collaborative perception without GNSS signals by using LiDAR-based localization to align multi-agent observations. It introduces a two-module pipeline (PGC for pose and confidence estimation, PASTAT for confidence-aware spatio-temporal alignment) together with the V2VLoc dataset for regression-based localization and collaborative detection. The approach achieves state-of-the-art performance under GNSS-denied conditions on V2VLoc and generalizes to real-world data (V2V4Real). Key contributions include the first application of LiDAR localization to GNSS-free feature alignment in collaboration, a dedicated dataset for GNSS-free collaboration, and ablations validating the effectiveness of PGC and PASTAT. The results suggest robust, bandwidth-efficient collaborative perception in challenging environments.

Abstract

Multi-agent systems rely on accurate poses to share and align observations, enabling collaborative perception of the environment. However, traditional GNSS-based localization often fails in GNSS-denied environments, making consistent feature alignment difficult in collaboration. To tackle this challenge, we propose a robust GNSS-free collaborative perception framework based on LiDAR localization. Specifically, we introduce a lightweight Pose Generator with Confidence (PGC) to estimate compact pose and confidence representations. To alleviate the effects of localization errors, we further develop the Pose-Aware Spatio-Temporal Alignment Transformer (PASTAT), which performs confidence-aware spatial alignment while capturing essential temporal context. Additionally, we present a new simulation dataset, V2VLoc, which can be adapted for both LiDAR localization and collaborative detection tasks. V2VLoc comprises three subsets: Town1Loc, Town4Loc, and V2VDet. Town1Loc and Town4Loc offer multi-traversal sequences for training localization models, whereas V2VDet is intended specifically for the collaborative detection task. Extensive experiments on the V2VLoc dataset demonstrate that our approach achieves state-of-the-art performance under GNSS-denied conditions. We further conduct experiments on the real-world V2V4Real dataset to validate the effectiveness and generalizability of PASTAT.
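
To make concrete why accurate poses matter for alignment, the following is a minimal sketch of how two agents' global poses can be combined to map one agent's observation into the other's frame. It is not the paper's implementation: it assumes a simplified planar pose (x, y, yaw) rather than a full 6-DoF pose, and all function names (`pose_to_matrix`, `relative_transform`, and so on) are hypothetical helpers introduced only for illustration.

```python
import math

def pose_to_matrix(x, y, yaw):
    """3x3 homogeneous matrix mapping agent-frame points into the world frame."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, x],
            [s,  c, y],
            [0.0, 0.0, 1.0]]

def invert_pose(T):
    """Invert a rigid 2D transform: rotation R^T, translation -R^T t."""
    c, s, x, y = T[0][0], T[1][0], T[0][2], T[1][2]
    return [[c,  s, -(c * x + s * y)],
            [-s, c, -(-s * x + c * y)],
            [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def relative_transform(ego_pose, other_pose):
    """Matrix mapping points from the other agent's frame into the ego frame."""
    return matmul(invert_pose(pose_to_matrix(*ego_pose)),
                  pose_to_matrix(*other_pose))

def transform_point(T, p):
    """Apply a homogeneous 2D transform to a point (px, py)."""
    px, py = p
    return (T[0][0] * px + T[0][1] * py + T[0][2],
            T[1][0] * px + T[1][1] * py + T[1][2])

# Assumed example poses (x [m], y [m], yaw [rad]) in a shared world frame.
ego_pose = (10.0, 5.0, 0.0)
other_pose = (20.0, 5.0, math.pi / 2)

# A point observed 2 m ahead of the other agent, expressed in the ego frame.
point_in_other = (2.0, 0.0)
point_in_ego = transform_point(relative_transform(ego_pose, other_pose),
                               point_in_other)

# The same observation aligned with a pose estimate that is off by 1 m in x.
noisy_other_pose = (21.0, 5.0, math.pi / 2)
noisy_point = transform_point(relative_transform(ego_pose, noisy_other_pose),
                              point_in_other)
misalignment = math.hypot(noisy_point[0] - point_in_ego[0],
                          noisy_point[1] - point_in_ego[1])
```

In this toy setup a 1 m translation error in the estimated pose shifts the aligned observation by the same 1 m, which illustrates the kind of localization-induced misalignment that a confidence-aware alignment step is meant to mitigate.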

Paper Structure

This paper contains 27 sections, 12 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: An illustration of different alignment methods. (a) shows that LiDAR localization achieves coarse alignment successfully. (b) shows that the Graph-Matching method fails without consensus objects.
  • Figure 2: Left: the error distribution of SGLoc compared with random noise at a noise level of 8.0 m. Right: model performance on V2VDet under LiDAR localization-induced pose errors.
  • Figure 3: Trajectory map of Town1Loc and Town4Loc. The yellow star indicates the starting point of the traversal.
  • Figure 4: The overall architecture of the proposed framework. The architecture consists of two modules: (a) Pose Generator with Confidence (PGC), (b) Pose-Aware Spatio-Temporal Alignment Transformer (PASTAT). The agent obtains the global pose and confidence through PGC, and then aligns the features through PASTAT to obtain the collaborative detection result. RSD: Redundant Sample Downsampling; RPC: Raw LiDAR Point Cloud; WPC: Point Cloud in World coordinate system; CE: Confidence Embedding; TE: Temporal Encoding.