SHeRLoc: Synchronized Heterogeneous Radar Place Recognition for Cross-Modal Localization

Hanjun Kim, Minwoo Jung, Wooseong Yang, Ayoung Kim

TL;DR

SHeRLoc tackles cross-modal localization across heterogeneous radar types by transforming data into synchronized RCS polar BEV representations and learning rotation-invariant, cross-modal embeddings. It introduces HOLMES, a hierarchical optimal-transport-based descriptor that fuses local RCS patterns with global context under an adaptive entropy-regularized Sinkhorn framework, and couples it with FoV-aware FFT-based data mining and an adaptive margin triplet loss. The approach yields dramatic gains on a public heterogeneous radar dataset, raising recall@1 from below $0.1$ to $0.9$, and demonstrates strong zero-shot generalization and cross-modal applicability to LiDAR. This work enables robust cross-modal place recognition and paves the way for heterogeneous sensor SLAM, with open-source code to accelerate community adoption.

Abstract

Despite the growing adoption of radar in robotics, the majority of research has been confined to homogeneous sensor types, overlooking the integration and cross-modality challenges inherent in heterogeneous radar technologies. This leads to significant difficulties in generalizing across diverse radar data types, with modality-aware approaches that could leverage the complementary strengths of heterogeneous radar remaining unexplored. To bridge these gaps, we propose SHeRLoc, the first deep network tailored for heterogeneous radar, which utilizes RCS polar matching to align multimodal radar data. Our hierarchical optimal transport-based feature aggregation method generates rotationally robust multi-scale descriptors. By employing FFT-similarity-based data mining and adaptive margin-based triplet loss, SHeRLoc enables FOV-aware metric learning. SHeRLoc achieves an order-of-magnitude improvement in heterogeneous radar place recognition, increasing recall@1 from below 0.1 to 0.9 on a public dataset and outperforming state-of-the-art methods. Also applicable to LiDAR, SHeRLoc paves the way for cross-modal place recognition and heterogeneous sensor SLAM. The supplementary materials and source code are available at https://sites.google.com/view/radar-sherloc.
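To make the optimal-transport aggregation more concrete, the following is a minimal sketch of a generic entropy-regularized Sinkhorn solver that softly assigns local features to a small set of clusters and pools them. It is not the authors' HOLMES implementation: the feature sizes, the cluster centres, the uniform marginals, and the fixed regularization strength `eps` (the paper describes an adaptive one) are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn(cost, a, b, eps=0.5, n_iters=200):
    """Entropy-regularized OT plan between histograms a (rows) and b (columns)."""
    K = np.exp(-cost / eps)              # Gibbs kernel of the cost matrix
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)                  # rescale rows to match a
        v = b / (K.T @ u)                # rescale columns to match b
    return u[:, None] * K * v[None, :]   # transport plan

rng = np.random.default_rng(0)
local_feats = rng.normal(size=(6, 4))    # hypothetical local features
clusters = rng.normal(size=(3, 4))       # hypothetical cluster centres

# Squared Euclidean cost, scaled to [0, 1] so exp(-cost / eps) stays well-conditioned.
cost = ((local_feats[:, None, :] - clusters[None, :, :]) ** 2).sum(axis=-1)
cost = cost / cost.max()

a = np.full(6, 1.0 / 6.0)                # uniform mass over local features
b = np.full(3, 1.0 / 3.0)                # uniform mass over clusters
plan = sinkhorn(cost, a, b)

# Pool features per cluster with the soft assignment and flatten to one vector.
descriptor = (plan.T @ local_feats).ravel()
```

Smaller values of `eps` push the plan toward a hard assignment but slow convergence, which is one reason an adaptive regularization can be attractive.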

Paper Structure

This paper contains 32 sections, 15 equations, 9 figures, 7 tables.

Figures (9)

  • Figure 1: SHeRLoc generates $n_v$ views from spinning radar scans to align with the narrow FOV of 4D radar, while bridging the modality gap across heterogeneous radars through RCS polar matching.
  • Figure 2: The overall pipeline of SHeRLoc. RCS polar images $I_{\text{4D}}$ and $I_{\text{spin}}$ are generated from heterogeneous radars and processed through a shared feature extraction network $\mathcal{G}$. Multi-level features $F_M$ and $F_H$ are aggregated into global descriptors $\mathcal{D}$ using HOLMES.
  • Figure 3: The pipeline of RCS polar synchronization. The turbo colormap is applied to $I_{\text{4D}}$ and $I_{\text{spin}}$ for visualization clarity.
  • Figure 4: For a far but overlapping negative and a nearby but rotated positive, Cartesian BEV yields higher similarity with the negative due to rotation variance. In contrast, polar BEV produces higher similarity with the positive, demonstrating robustness to rotation.
  • Figure 5: Trajectory from Sports Complex, Library, and River Island sequences, with green indicating true matching pairs, highlighting SHeRLoc's robustness in challenging scenarios.
  • ...and 4 more figures
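The rotation robustness described for Figure 4 follows from a simple property: in a polar BEV image, a sensor yaw rotation becomes a circular shift along the azimuth axis, and the best alignment over all shifts can be found with an FFT. The sketch below illustrates that property on a synthetic image. It is not the paper's FFT-similarity mining procedure; the function name `max_circular_correlation`, the image size, and the normalization are assumptions chosen for the example.

```python
import numpy as np

def max_circular_correlation(p, q):
    """Best normalized correlation of p and q over azimuth shifts (axis 0)."""
    P = np.fft.fft(p, axis=0)
    Q = np.fft.fft(q, axis=0)
    # Correlation theorem: the inverse FFT of P * conj(Q) gives the circular
    # cross-correlation per range bin; sum over range bins for one score per shift.
    corr = np.fft.ifft(P * np.conj(Q), axis=0).real.sum(axis=1)
    shift = int(np.argmax(corr))
    score = corr[shift] / (np.linalg.norm(p) * np.linalg.norm(q))
    return score, shift

rng = np.random.default_rng(0)
polar = rng.random((36, 16))             # hypothetical image: 36 azimuth x 16 range bins
rotated = np.roll(polar, 7, axis=0)      # a yaw rotation of 7 azimuth bins

score, shift = max_circular_correlation(rotated, polar)

# Cosine similarity without alignment, for comparison.
naive = float((rotated * polar).sum() / (np.linalg.norm(rotated) * np.linalg.norm(polar)))
```

Here `shift` recovers the applied rotation and `score` is the similarity after alignment, whereas `naive` compares the two images without compensating for the rotation and is therefore lower.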