PanFlow: Decoupled Motion Control for Panoramic Video Generation
Cheng Zhang, Hanwen Liang, Donny Y. Chen, Qianyi Wu, Konstantinos N. Plataniotis, Camilo Cruz Gambardella, Jianfei Cai
TL;DR
PanFlow addresses motion control in panoramic video generation by decoupling camera rotation from the optical flow condition and by promoting loop-consistent diffusion through spherical noise warping and latent rotation. Its decoupled motion framework uses spherical optical flow to separate camera rotation from translation and object motion, then applies the inverse rotation to recover the full motion. A motion-rich panoramic video dataset with frame-level pose and flow annotations supports training and evaluation. Experiments show that PanFlow surpasses prior methods in motion fidelity, temporal coherence, and visual quality, and the method is also demonstrated on motion transfer and video editing.
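To make the idea of derotated flow concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes an equirectangular projection with pixel-centre sampling, a y-up coordinate frame, and a known per-frame rotation matrix R that maps source viewing directions to destination directions. All function names (pixels_to_dirs, dirs_to_pixels, rotation_flow, derotate_flow, yaw_matrix) are hypothetical and chosen only for illustration.

```python
import numpy as np

def pixels_to_dirs(u, v, h, w):
    """Map equirectangular pixel coordinates to unit viewing directions (y-up)."""
    lon = (u + 0.5) / w * 2 * np.pi - np.pi      # longitude in [-pi, pi)
    lat = np.pi / 2 - (v + 0.5) / h * np.pi      # latitude in (-pi/2, pi/2)
    return np.stack([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)

def dirs_to_pixels(d, h, w):
    """Inverse of pixels_to_dirs: unit directions back to pixel coordinates."""
    lon = np.arctan2(d[..., 0], d[..., 2])
    lat = np.arcsin(np.clip(d[..., 1], -1.0, 1.0))
    u = (lon + np.pi) / (2 * np.pi) * w - 0.5
    v = (np.pi / 2 - lat) / np.pi * h - 0.5
    return u, v

def wrap_u(du, w):
    """Wrap a horizontal displacement into [-w/2, w/2) across the panorama seam."""
    return (du + w / 2) % w - w / 2

def pixel_grid(h, w):
    """Integer pixel coordinates (u, v) as float arrays of shape (h, w)."""
    return np.meshgrid(np.arange(w, dtype=float), np.arange(h, dtype=float))

def rotation_flow(R, h, w):
    """Flow (du, dv) that a pure camera rotation R induces on a static scene."""
    u0, v0 = pixel_grid(h, w)
    d_rot = pixels_to_dirs(u0, v0, h, w) @ R.T   # rotate every direction by R
    u1, v1 = dirs_to_pixels(d_rot, h, w)
    return np.stack([wrap_u(u1 - u0, w), v1 - v0], axis=-1)

def derotate_flow(flow, R, h, w):
    """Remove rotation R from a flow field by un-rotating each flow target."""
    u0, v0 = pixel_grid(h, w)
    d1 = pixels_to_dirs(u0 + flow[..., 0], v0 + flow[..., 1], h, w)
    d_derot = d1 @ R                             # applies R^T (the inverse) per direction
    u2, v2 = dirs_to_pixels(d_derot, h, w)
    return np.stack([wrap_u(u2 - u0, w), v2 - v0], axis=-1)

def yaw_matrix(theta):
    """Rotation about the vertical (y) axis; positive theta increases longitude."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
```

Under these assumptions, a flow field produced by camera rotation alone derotates to zero, so whatever remains after derotate_flow is attributable to translation and object motion. Note that the derotation is done by composing the flow target with the inverse rotation on the sphere rather than by subtracting two flow fields in pixel space, since the two only agree for small motions.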
Abstract
Panoramic video generation has attracted growing attention due to its applications in virtual reality and immersive media. However, existing methods lack explicit motion control and struggle to generate scenes with large and complex motions. We propose PanFlow, a novel approach that exploits the spherical nature of panoramas to decouple the highly dynamic camera rotation from the input optical flow condition, enabling more precise control over large and dynamic motions. We further introduce a spherical noise warping strategy to promote loop consistency in motion across panorama boundaries. To support effective training, we curate a large-scale, motion-rich panoramic video dataset with frame-level pose and flow annotations. We also showcase the effectiveness of our method in various applications, including motion transfer and video editing. Extensive experiments demonstrate that PanFlow significantly outperforms prior methods in motion fidelity, visual quality, and temporal coherence. Our code, dataset, and models are available at https://github.com/chengzhag/PanFlow.
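The loop-consistency requirement rests on the fact that the left and right edges of an equirectangular panorama are adjacent on the sphere. As a minimal sketch of that property only, and not of PanFlow's spherical noise warping itself, a yaw rotation of an equirectangular tensor can be written as a circular shift along the width axis. The function name rotate_latent_yaw, the (..., H, W) layout, and the sign convention (positive yaw moves content toward higher column indices) are assumptions made for illustration.

```python
import numpy as np

def rotate_latent_yaw(latent, yaw_rad):
    """Circularly shift an equirectangular array of shape (..., H, W) by a yaw angle.

    The shift is rounded to the nearest whole column, so this is exact only
    for yaw angles that are multiples of 2*pi / W.
    """
    w = latent.shape[-1]
    shift = int(round(yaw_rad / (2 * np.pi) * w))
    # np.roll wraps columns that leave the right edge back in at the left edge.
    return np.roll(latent, shift, axis=-1)
```

Because the shift wraps around, content leaving one boundary reappears at the other, and rotating by an angle and then by its negative returns the original array. Any operation applied to a panorama that is meant to be loop consistent should commute with this kind of shift.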
