SphereDrag: Spherical Geometry-Aware Panoramic Image Editing
Zhiao Feng, Xuewei Li, Junjie Yang, Jingchao Li, Yuxin Peng, Xi Li
TL;DR
This work tackles panoramic image editing by integrating spherical geometry into a diffusion-based editing framework. The authors introduce SphereDrag, featuring adaptive reprojection (AR) to mitigate boundary discontinuities, great-circle trajectory adjustment (GCTA) to align edits with spherical paths, and spherical search region tracking (SSRT) to compensate for latitude-based pixel density distortions. They also present PanoBench, a dedicated panoramic editing benchmark, enabling standardized evaluation across scenes and styles. Empirical results show SphereDrag achieves state-of-the-art geometric consistency and image quality, with notable improvements in the Image Fidelity metric and reductions in FID/sFID, demonstrating practical potential for accurate, controllable editing of 360-degree content.
Abstract
Image editing has made great progress on planar images, but panoramic image editing remains underexplored. Due to their spherical geometry and projection distortions, panoramic images present three key challenges: boundary discontinuity, trajectory deformation, and uneven pixel density. To tackle these issues, we propose SphereDrag, a novel panoramic editing framework that uses spherical geometry knowledge for accurate and controllable editing. Specifically, adaptive reprojection (AR) uses adaptive spherical rotation to handle boundary discontinuity; great-circle trajectory adjustment (GCTA) tracks the movement trajectory more accurately; and spherical search region tracking (SSRT) adaptively scales the search range based on spherical location to address uneven pixel density. We also construct PanoBench, a panoramic editing benchmark that includes complex editing tasks involving multiple objects and diverse styles and provides a standardized evaluation framework. Experiments show that SphereDrag improves considerably over existing methods in geometric consistency and image quality, achieving up to 10.5% relative improvement.
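To make the geometry behind GCTA and SSRT concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes an equirectangular panorama, interpolates a drag path along a great circle by spherical linear interpolation, and widens a horizontal search half-width by 1/cos(latitude), the standard equirectangular stretch factor. The function names, the `max_scale` clamp, and the exact scaling rule are hypothetical choices for illustration; the abstract does not specify them.

```python
import math


def pixel_to_sphere(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit vector on the sphere."""
    lon = (u / width) * 2.0 * math.pi - math.pi    # longitude in [-pi, pi)
    lat = math.pi / 2.0 - (v / height) * math.pi   # latitude in [-pi/2, pi/2]
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def sphere_to_pixel(p, width, height):
    """Map a unit vector back to equirectangular pixel coordinates."""
    x, y, z = p
    lon = math.atan2(y, x)
    lat = math.asin(max(-1.0, min(1.0, z)))
    u = ((lon + math.pi) / (2.0 * math.pi) * width) % width  # wrap at the seam
    v = (math.pi / 2.0 - lat) / math.pi * height
    return u, v


def great_circle_path(start_uv, end_uv, width, height, n_steps):
    """Return n_steps + 1 pixel positions along the great circle between
    two pixels (spherical linear interpolation)."""
    p0 = pixel_to_sphere(*start_uv, width, height)
    p1 = pixel_to_sphere(*end_uv, width, height)
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p0, p1))))
    omega = math.acos(dot)  # angle between the two points
    if omega < 1e-9:
        return [tuple(start_uv)] * (n_steps + 1)
    if math.pi - omega < 1e-9:
        raise ValueError("antipodal points: the great circle is not unique")
    path = []
    for i in range(n_steps + 1):
        t = i / n_steps
        w0 = math.sin((1.0 - t) * omega) / math.sin(omega)
        w1 = math.sin(t * omega) / math.sin(omega)
        point = tuple(w0 * a + w1 * b for a, b in zip(p0, p1))
        path.append(sphere_to_pixel(point, width, height))
    return path


def search_half_width(base_radius, v, height, max_scale=8.0):
    """Assumed rule: scale the horizontal search half-width by 1 / cos(lat),
    clamped to max_scale so it stays finite near the poles."""
    lat = math.pi / 2.0 - (v / height) * math.pi
    scale = min(1.0 / max(math.cos(lat), 1e-6), max_scale)
    return base_radius * scale
```

For example, in a 512x256 panorama, `great_circle_path((128, 64), (256, 64), 512, 256, 2)` bends toward the pole at its midpoint instead of staying on the row `v = 64`, which is the trajectory deformation a straight-line drag in image space would miss. Likewise, `search_half_width(3, 64, 256)` is larger than `search_half_width(3, 128, 256)`, reflecting the horizontal stretching of pixels away from the equator.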
