EDM: Equirectangular Projection-Oriented Dense Kernelized Feature Matching

Dongki Jung, Jaehoon Choi, Yonghan Lee, Somi Jeong, Taejae Lee, Dinesh Manocha, Suyong Yeon

TL;DR

EDM addresses dense feature matching for omnidirectional images in equirectangular projection (ERP), where projection distortions degrade matchers designed for perspective images. It introduces a distortion-aware dense matcher that operates in a unified spherical framework, combining a Spherical Spatial Alignment Module with Geodesic Flow Refinement, supported by spherical positional embeddings and bidirectional spherical-Cartesian transformations. The method achieves state-of-the-art accuracy on Matterport3D and Stanford2D3D, with AUC@5° gains of +26.72 and +42.62, respectively, and shows qualitative robustness on EgoNeRF and OmniPhotos. By optimizing angular agreement on the unit sphere and promoting distortion-aware representations, EDM advances practical localization and mapping for omnidirectional imagery. Its main limitation is a bias toward indoor, gravity-aligned data; future work includes broadening data diversity and extending the method to downstream tasks such as localization and mapping.

Abstract

We introduce the first learning-based dense matching algorithm, termed Equirectangular Projection-Oriented Dense Kernelized Feature Matching (EDM), specifically designed for omnidirectional images. Equirectangular projection (ERP) images, with their large fields of view, are particularly suited for dense matching techniques that aim to establish comprehensive correspondences across images. However, ERP images are subject to significant distortions, which we address by leveraging the spherical camera model and geodesic flow refinement in the dense matching method. To further mitigate these distortions, we propose spherical positional embeddings based on 3D Cartesian coordinates of the feature grid. Additionally, our method incorporates bidirectional transformations between spherical and Cartesian coordinate systems during refinement, utilizing a unit sphere to improve matching performance. We demonstrate that our proposed method achieves notable performance enhancements, with improvements of +26.72 and +42.62 in AUC@5° on the Matterport3D and Stanford2D3D datasets, respectively.
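
As a rough illustration of the spherical camera model mentioned above, the following sketch maps the pixel centers of an ERP feature grid to 3D Cartesian coordinates on the unit sphere, the kind of input a spherical positional embedding could be built from. The axis convention (y up, longitude measured from +z), the function name `erp_grid_to_unit_sphere`, and the pixel-center sampling are assumptions made for illustration; they are not taken from the paper.

```python
import math


def erp_grid_to_unit_sphere(height, width):
    """Return a height x width nested list of (x, y, z) points on the unit sphere.

    Assumed convention (hypothetical, not from the paper):
      longitude theta in [-pi, pi) spans the image width,
      latitude  phi   in (-pi/2, pi/2) spans the image height (top row is north),
      x = cos(phi) sin(theta), y = sin(phi), z = cos(phi) cos(theta).
    """
    grid = []
    for row in range(height):
        # Sample at pixel centers so that no sample lands exactly on a pole.
        phi = math.pi / 2 - (row + 0.5) / height * math.pi
        line = []
        for col in range(width):
            theta = (col + 0.5) / width * 2 * math.pi - math.pi
            line.append((
                math.cos(phi) * math.sin(theta),
                math.sin(phi),
                math.cos(phi) * math.cos(theta),
            ))
        grid.append(line)
    return grid


# A tiny 4 x 8 feature grid; real feature maps would be much larger.
grid = erp_grid_to_unit_sphere(4, 8)
```

Every returned point has unit norm, and horizontally adjacent cells in rows near the poles lie closer together in 3D than cells near the equator, which is the distortion a planar positional encoding would not capture.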

Paper Structure

This paper contains 23 sections, 16 equations, 11 figures, 3 tables.

Figures (11)

  • Figure 1: (a) The previous state-of-the-art method, DKM (edstedt2023dkm), struggles to achieve accurate dense matching in equirectangular projection (ERP) images due to inherent distortions. (b) The ERP image can be transformed into a cubemap image, which consists of six perspective images. However, this approach demands multiple independent inference passes, one for each pair of perspective images, which increases computational complexity and loses the global information in the ERP image. (c) Our proposed method, EDM, leverages the spherical camera model, rendering it robust against distortions. Warp refers to results obtained by multiplying the warped image by the predicted certainty map, demonstrating that our method yields more accurate dense matches.
  • Figure 2: Coordinate system.
  • Figure 3: Overview of our approach. It consists of three steps: Multi-scale Feature Extraction, Spherical Spatial Alignment Module (Sec. \ref{section:ssam}), and Geodesic Flow Refinement (Sec. \ref{section:gfr}).
  • Figure 4: Our Spherical Spatial Alignment Module. We present Spherical Positional Embedding (red dotted box). The embedding decoder generates the global matches $\hat{\mathbf{S}}^{\text{coarse}}_{{A} \to {B}}$. Here, the gray curved lines represent the geodesic flow between $\mathbf{S}_{A}$ and $\mathbf{S}_{B}$. $\oplus$ denotes concatenation, $\otimes$ means reshape and matrix multiplication. We provide the matrix dimensions of intermediate features for reference.
  • Figure 5: Our proposed Geodesic Flow Refinement. Refining the displacement along curved lines on the spherical surface presents significant challenges. To address this, we project the displacement into the ERP space for refinement (Cartesian to spherical) and subsequently unproject it back onto the spherical surface for further refinement (spherical to Cartesian).
  • ...and 6 more figures
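
The round trip described in the Figure 5 caption (Cartesian to spherical, refine, spherical to Cartesian) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the axis convention, the helper names, the use of angular offsets in radians, and the clamping of latitude at the poles are all hypothetical simplifications.

```python
import math


def cartesian_to_spherical(p):
    """Unit vector -> (longitude theta, latitude phi), assumed y-up convention."""
    x, y, z = p
    theta = math.atan2(x, z)
    phi = math.asin(max(-1.0, min(1.0, y)))  # clamp guards against rounding
    return theta, phi


def spherical_to_cartesian(theta, phi):
    """(longitude, latitude) -> unit vector, inverse of the function above."""
    return (
        math.cos(phi) * math.sin(theta),
        math.sin(phi),
        math.cos(phi) * math.cos(theta),
    )


def refine_on_sphere(p, d_theta, d_phi):
    """Apply a 2D offset in (theta, phi) space, then return to the sphere."""
    theta, phi = cartesian_to_spherical(p)
    # Longitude is periodic, so wrap it back into [-pi, pi).
    theta = (theta + d_theta + math.pi) % (2 * math.pi) - math.pi
    # Simplification: clamp latitude instead of handling pole crossings.
    phi = max(-math.pi / 2, min(math.pi / 2, phi + d_phi))
    return spherical_to_cartesian(theta, phi)


def geodesic_angle(p, q):
    """Great-circle angle in radians between two unit vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))


# A point on the equator near the left/right image seam, nudged across it.
start = spherical_to_cartesian(math.radians(179.0), 0.0)
refined = refine_on_sphere(start, math.radians(2.0), 0.0)
```

Here the offset carries the point across the longitude seam: the refined longitude wraps to about -179 degrees, while the geodesic angle between `start` and `refined` stays at 2 degrees. Keeping the state as a unit vector and measuring error as an angle avoids the discontinuity that a raw pixel-space difference would show at the image border.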