CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation

Nikolai Kalischek; Michael Oechsle; Fabian Manhardt; Philipp Henzler; Konrad Schindler; Federico Tombari

TL;DR

CubeDiff repurposes pretrained diffusion models to generate 360° panoramas by operating on cubemaps: each of the six faces is treated as a perspective image, and inflated attention provides coherence across faces. Key components include synchronized GroupNorm across faces, cube-geometry positional encodings, overlapping face predictions, and classifier-free guidance, all trained on a diverse panorama corpus. Empirical results on Laval Indoor and SUN360 show state-of-the-art perceptual and text-alignment metrics, strong generalization across text-only, image-only, and text+image conditioning, and support for fine-grained per-face text control. The approach produces high-resolution, coherent panoramas with minimal architectural changes to existing diffusion models, which makes it practical for VR, gaming, and creative content generation.
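To make the idea of synchronized GroupNorm concrete, here is a minimal pure-Python sketch, not the authors' implementation. It assumes a single channel group flattened to a list of floats per face; the function names `per_face_groupnorm` and `synced_groupnorm` are hypothetical. It contrasts normalizing each face with its own statistics against normalizing all six faces with statistics pooled over the whole cube.

```python
import math

def mean_var(values):
    """Mean and (biased) variance of a flat list of floats."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v

def normalize(values, mean, var, eps=1e-5):
    """Standard normalization step used by GroupNorm-style layers."""
    return [(x - mean) / math.sqrt(var + eps) for x in values]

def per_face_groupnorm(faces):
    """Baseline: every face is normalized with its own statistics."""
    return [normalize(f, *mean_var(f)) for f in faces]

def synced_groupnorm(faces):
    """Assumed synchronized variant: statistics pooled over all faces."""
    pooled = [x for f in faces for x in f]
    m, v = mean_var(pooled)
    return [normalize(f, m, v) for f in faces]
```

With per-face statistics, every face is re-centered independently, so brightness differences between faces are lost. Pooled statistics keep the faces on a common scale, which is the plausible motivation for synchronizing the layer across the cube.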

Abstract

We introduce a novel method for generating 360° panoramas from text prompts or images. Our approach leverages recent advances in 3D generation by employing multi-view diffusion models to jointly synthesize the six faces of a cubemap. Unlike previous methods that rely on processing equirectangular projections or on autoregressive generation, our method treats each face as a standard perspective image, simplifying the generation process and enabling the use of existing multi-view diffusion models. We demonstrate that these models can be adapted to produce high-quality cubemaps without requiring correspondence-aware attention layers. Our model allows for fine-grained text control, generates high-resolution panorama images, and generalizes well beyond its training set, while achieving state-of-the-art results both qualitatively and quantitatively. Project page: https://cubediff.github.io/
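The relationship between cubemap faces and an equirectangular panorama can be sketched with a few lines of geometry. The snippet below is an illustration under assumed conventions, not code from the paper: the face names, axis orientation (y up, "front" looking along +z), and the helpers `face_ray` and `to_lonlat` are all hypothetical. Each face is a 90° field-of-view perspective image, so a face coordinate maps to a viewing direction, which in turn maps to longitude and latitude.

```python
import math

# Assumed convention: (outward axis, image-right axis, image-up axis) per face.
FACES = {
    "front": ((0, 0, 1), (1, 0, 0), (0, 1, 0)),
    "right": ((1, 0, 0), (0, 0, -1), (0, 1, 0)),
    "back": ((0, 0, -1), (-1, 0, 0), (0, 1, 0)),
    "left": ((-1, 0, 0), (0, 0, 1), (0, 1, 0)),
    "up": ((0, 1, 0), (1, 0, 0), (0, 0, -1)),
    "down": ((0, -1, 0), (1, 0, 0), (0, 0, 1)),
}

def face_ray(face, u, v):
    """Unit viewing direction for face coordinates u, v in [-1, 1]."""
    fwd, right, up = FACES[face]
    d = [fwd[i] + u * right[i] + v * up[i] for i in range(3)]
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(c / norm for c in d)

def to_lonlat(direction):
    """Longitude in [-pi, pi] and latitude in [-pi/2, pi/2], in radians."""
    x, y, z = direction
    lat = math.asin(max(-1.0, min(1.0, y)))
    return math.atan2(x, z), lat
```

For example, `to_lonlat(face_ray("right", 0, 0))` gives a longitude of π/2 and a latitude of 0, and the right edge of the front face (`u = 1`) yields the same directions as the left edge of the right face (`u = -1`), which is the seam along which neighbouring faces must agree.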
