
Gimbal360: Differentiable Auto-Leveling for Canonicalized $360^\circ$ Panoramic Image Completion

Yuqin Lu, Haofeng Liu, Yang Zhou, Jun Liang, Shengfeng He, Jing Li

Abstract

Diffusion models excel at 2D outpainting, but extending them to $360^\circ$ panoramic completion from unposed perspective images is challenging due to the geometric and topological mismatch between perspective projections and spherical panoramas. We present Gimbal360, a principled framework that explicitly bridges perspective observations and spherical panoramas. We introduce a Canonical Viewing Space that regularizes projective geometry and provides a consistent intermediate representation between the two domains. To anchor in-the-wild inputs to this space, we propose a Differentiable Auto-Leveling module that stabilizes feature orientation without requiring camera parameters at inference. Panoramic generation also introduces a topological challenge. Standard generative architectures assume a bounded Euclidean image plane, while Equirectangular Projection (ERP) panoramas exhibit intrinsic $S^1$ periodicity. Euclidean operations therefore break boundary continuity. We address this mismatch by enforcing topological equivariance in the latent space to preserve seamless periodic structure. To support this formulation, we introduce Horizon360, a curated large-scale dataset of gravity-aligned panoramic environments. Extensive experiments show that explicitly standardizing geometric and topological priors enables Gimbal360 to achieve state-of-the-art performance in structurally consistent $360^\circ$ scene completion.

Paper Structure (35 sections, 17 equations, 13 figures, 1 table)

Figures (13)

  • Figure 1: We propose Gimbal360, a framework that auto-levels perspective inputs and performs geometry-aware $360^\circ$ completion in a canonical viewing space. Our method preserves global structure and maintains continuity across the periodic panorama seams (red boxes). Please zoom in for a better view.
  • Figure 2: Overview of the Gimbal360 framework. Given a perspective sample from our Horizon360 dataset, the Differentiable Auto-Leveling module predicts a rigid correspondence field to warp the perspective image into a gravity-aligned, yaw-centered Canonical Viewing Space. During Topologically Equivariant Generation, a Siamese Consistency Loss between the standard and horizontally shifted latent streams forces the network to natively respect the continuous $S^1$ boundary.
  • Figure 3: Qualitative comparison on indoor scenes. Large vertical extents stress-test geometric consistency: our method corrects tilted inputs to produce plumb walls and level floors, whereas baselines exhibit artifacts such as tilted horizons and barrel distortion.
  • Figure 4: Qualitative comparison on outdoor scenes. Our method consistently produces gravity-aligned panoramas with straight horizon lines and coherent architectural structures, while competing methods exhibit horizon tilts, curved distortions, or boundary discontinuities.
  • Figure 5: Qualitative in-the-wild results. We show results on diverse real-world inputs, including images with unknown camera parameters and wide-angle shots with significant roll angles.
  • ...and 8 more figures
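
The Figure 2 caption describes a Siamese Consistency Loss between a standard and a horizontally shifted latent stream. The toy sketch below illustrates the underlying idea on a single row of an ERP image, treating longitude as periodic ($S^1$). It is not the authors' implementation: the function names (`roll`, `conv1d`, `shift_consistency_loss`), the 3-tap kernel, and the mean-squared-error form are all assumptions chosen for illustration. It compares an operator with wrap-around padding against the same operator with zero padding, and measures how far each is from commuting with a circular shift.

```python
def roll(row, shift):
    """Circularly shift a list along the longitude axis (hypothetical helper)."""
    n = len(row)
    shift %= n
    return row[-shift:] + row[:-shift] if shift else list(row)


def conv1d(row, kernel, circular):
    """Odd-length convolution with wrap-around (circular) or zero padding."""
    n = len(row)
    r = len(kernel) // 2
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - r
            if circular:
                acc += w * row[j % n]      # wrap across the ERP seam
            elif 0 <= j < n:
                acc += w * row[j]          # out-of-range taps contribute zero
        out.append(acc)
    return out


def shift_consistency_loss(f, row, shift):
    """Assumed MSE form: compare f(roll(x)) with roll(f(x)).

    The value is zero when f commutes with this circular shift on this input.
    """
    a = f(roll(row, shift))
    b = roll(f(row), shift)
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(row)


row = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # one toy latent row
kernel = [0.25, 0.5, 0.25]                        # illustrative smoothing kernel

loss_circular = shift_consistency_loss(lambda x: conv1d(x, kernel, True), row, 3)
loss_zero_pad = shift_consistency_loss(lambda x: conv1d(x, kernel, False), row, 3)
print(loss_circular, loss_zero_pad)
```

In this toy setting the wrap-around operator commutes with the shift, so its loss vanishes, while the zero-padded operator is penalized because it treats the left and right image borders as hard edges. A loss of this kind, applied to a generative network's latent streams, would push the network toward the first behavior.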