Geometry Fidelity for Spherical Images
Anders Christensen, Nooshin Mojab, Khushman Patel, Karan Ahuja, Zeynep Akata, Ole Winther, Mar Gonzalez-Franco, Andrea Colaco
TL;DR
The paper shows that standard Fréchet Inception Distance (FID) leaves a gap when evaluating spherical image generation, because it overlooks geometry-specific distortions. It introduces OmniFID, a cubemap-based extension of FID that aggregates per-view distributions across three view groups $\{U,D,\mathcal{F}\}$ to capture field-of-view fidelity while preserving FID’s sensitivity to noise: $\mathrm{OmniFID}(X_1,X_2) = \frac{1}{3} \sum_{V\in\{U,D,\mathcal{F}\}} \overline{\mathrm{FID}}(\mathcal{C}^{X_1}_V, \mathcal{C}^{X_2}_V)$. It also defines the Discontinuity Score (DS), a kernel-based measure of seam alignment across the borders of equirectangular representations, with $\mathrm{DS}(I) = \frac{L}{H_E} \sum_i \mathrm{DS}(a_i)$ and $\mathrm{DS}(a) = \frac{1}{2L} \sum_{y=0}^{L-1} \left( \frac{|\hat{a}(2,y)|}{|\hat{a}(1,y)|+c} + \frac{|\hat{a}(3,y)|}{|\hat{a}(4,y)|+c} \right)$. In experiments on datasets such as 360-Indoor, OmniFID detects vertical field-of-view (FOV) reductions that FID misses, and DS correlates with seam severity across resolutions. Together, the two metrics support geometry-aware evaluation of spherical image generation, enable more reliable benchmarking, and can inform future metric development and dataset design for panoramic imagery.
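The OmniFID aggregation above can be illustrated with a minimal sketch. It assumes that the per-face FID values have already been computed by some standard FID implementation, that $U$ and $D$ are the up and down cubemap faces, that $\mathcal{F}$ holds the four side faces, and that $\overline{FID}$ is the mean of the per-face FIDs within a group. The face names and the function `omnifid` are hypothetical and are not taken from the paper.

```python
from statistics import mean

# Assumed grouping of cubemap faces into the three view groups {U, D, F}.
# The face names are illustrative, not the paper's notation.
VIEW_GROUPS = {
    "U": ["up"],
    "D": ["down"],
    "F": ["front", "right", "back", "left"],
}


def omnifid(per_face_fid):
    """Average the group-wise mean FID over the three view groups.

    per_face_fid maps a face name to a precomputed FID value between
    the corresponding cubemap faces of the two image sets.
    """
    group_means = [
        mean(per_face_fid[face] for face in faces)
        for faces in VIEW_GROUPS.values()
    ]
    # The 1/3 factor in the formula: one term per view group.
    return sum(group_means) / 3


# Example with made-up per-face FID values.
example = {"up": 30.0, "down": 60.0,
           "front": 10.0, "right": 20.0, "back": 30.0, "left": 40.0}
score = omnifid(example)
```

Because each group contributes equally, the up and down faces together carry two thirds of the weight under this reading, which is one way such a score could become sensitive to vertical field-of-view errors.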
Abstract
Spherical or omni-directional images offer an immersive visual format appealing to a wide range of computer vision applications. However, geometric properties of spherical images pose a major challenge for models and metrics designed for ordinary 2D images. Here, we show that direct application of Fréchet Inception Distance (FID) is insufficient for quantifying geometric fidelity in spherical images. We introduce two quantitative metrics accounting for geometric constraints, namely Omnidirectional FID (OmniFID) and Discontinuity Score (DS). OmniFID is an extension of FID tailored to additionally capture field-of-view requirements of the spherical format by leveraging cubemap projections. DS is a kernel-based seam alignment score of continuity across borders of 2D representations of spherical images. In experiments, OmniFID and DS quantify geometry fidelity issues that are undetected by FID.
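As a rough illustration of the DS formulas given in the TL;DR, the sketch below evaluates $DS(a)$ and $DS(I)$ from kernel responses that are assumed to be already computed. How $\hat{a}$ is obtained (the kernel itself) is not specified here, so each $\hat{a}$ is simply taken as four sequences of length $L$, indexed so that `a_hat[x - 1][y]` plays the role of $\hat{a}(x,y)$. The function names and the default value of `c` are assumptions made for illustration only.

```python
def ds_area(a_hat, c=1e-6):
    """DS(a): mean ratio of border responses to their outer neighbours.

    a_hat is a list of four equal-length sequences; a_hat[x - 1][y]
    stands in for the response written as a-hat(x, y) in the formula.
    c is a small stabilising constant (its value here is an assumption).
    """
    L = len(a_hat[0])
    total = 0.0
    for y in range(L):
        # Response at position 2 relative to position 1.
        total += abs(a_hat[1][y]) / (abs(a_hat[0][y]) + c)
        # Response at position 3 relative to position 4.
        total += abs(a_hat[2][y]) / (abs(a_hat[3][y]) + c)
    return total / (2 * L)


def ds_image(areas, L, H_E, c=1e-6):
    """DS(I) = (L / H_E) * sum of DS(a_i) over all areas a_i."""
    return (L / H_E) * sum(ds_area(a, c) for a in areas)


# Example: uniform responses give a ratio of 1 everywhere (with c = 0).
flat = [[1.0] * 4 for _ in range(4)]
area_score = ds_area(flat, c=0.0)
image_score = ds_image([flat, flat], L=4, H_E=8, c=0.0)
```

In this reading, a response at the border positions that is large relative to its outer neighbours raises the score.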
