SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception

Yaniv Benny; Lior Wolf

TL;DR

SphereUFormer is a transformer-based architecture that operates directly in the spherical domain by incorporating a novel "Spherical Local Self-Attention" and other spherically-oriented modules. It outperforms the state of the art on 360° perception benchmarks for depth estimation and semantic segmentation.

Abstract

This paper proposes a novel method for omnidirectional 360° perception. Most previous methods relied on the equirectangular projection. This representation is easily applicable to 2D operation layers but introduces distortions into the image. Other methods attempted to remove the distortions by maintaining a sphere representation, but they relied on complicated convolution kernels and failed to show competitive results. In this work, we introduce a transformer-based architecture that, by incorporating a novel "Spherical Local Self-Attention" and other spherically-oriented modules, successfully operates in the spherical domain and outperforms the state of the art in 360° perception benchmarks for depth estimation and semantic segmentation.
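To make the distortion claim concrete, the following is a minimal sketch, not taken from the paper, of how unevenly an equirectangular image samples the sphere. It relies only on the standard geometric fact that the area on the sphere covered by one equirectangular pixel is proportional to the cosine of its latitude. The function name `equirect_row_weights` and the 8-row image height are illustrative assumptions.

```python
import math


def equirect_row_weights(height):
    """Relative area on the sphere covered by one pixel in each row of an
    equirectangular image with `height` rows, normalised so that a pixel
    lying exactly on the equator would have weight 1.0.

    Hypothetical helper for illustration; not part of SphereUFormer.
    """
    weights = []
    for row in range(height):
        # Latitude of the row centre, from near +pi/2 (top) to near -pi/2 (bottom).
        lat = math.pi / 2 - (row + 0.5) * math.pi / height
        # Pixel footprint on the sphere shrinks with cos(latitude).
        weights.append(math.cos(lat))
    return weights


weights = equirect_row_weights(8)
for row, w in enumerate(weights):
    print(f"row {row}: relative pixel area on the sphere = {w:.3f}")
```

Rows near the poles cover far less of the sphere per pixel than rows near the equator, so a 2D layer that treats all pixels alike effectively over-samples the polar regions. This is the kind of distortion that a representation defined directly on the sphere is meant to avoid.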
