
LightCtrl: Training-free Controllable Video Relighting

Yizuo Peng, Xuelin Chen, Kai Zhang, Xiaodong Cun

Abstract

Recent diffusion models have achieved remarkable success in image relighting, and this success has quickly been extended to video relighting. However, existing methods offer limited explicit control over illumination in the relighted output. We present LightCtrl, the first controllable video relighting method that enables explicit control of video illumination through a user-supplied light trajectory in a training-free manner. Our approach combines pre-trained diffusion models: an image relighting model processes each frame individually, and a video diffusion prior then enhances temporal consistency. To achieve explicit control over dynamically varying lighting, we introduce two key components. First, a Light Map Injection module samples light trajectory-specific noise and injects it into the latent representation of the source video, improving illumination coherence with the conditional light trajectory. Second, a Geometry-Aware Relighting module dynamically combines RGB and normal map latents in the frequency domain to suppress the influence of the original lighting, further enhancing adherence to the input light trajectory. Experiments show that LightCtrl produces high-quality videos with diverse illumination changes that closely follow the specified light trajectory, demonstrating improved controllability over baseline methods. Code is available at: https://github.com/GVCLab/LightCtrl.
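The Light Map Injection idea described above can be illustrated with a small hypothetical sketch: Gaussian noise is biased toward a per-frame light map before forward-noising the source latent, so the initial noise already carries the target illumination structure. The function name, the `mix` weight, and the single-step alpha schedule below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def inject_light_map(source_latent, light_map, t_strength=0.8, mix=0.3, seed=0):
    """Hypothetical sketch of light-trajectory-specific noise injection.

    `mix` (bias toward the light map) and the single-step alpha schedule
    are assumptions for illustration, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(source_latent.shape)
    # Bias Gaussian noise toward the light map, then renormalize its scale.
    biased = (1.0 - mix) * eps + mix * light_map
    biased = biased / biased.std()
    # Forward-noise the source latent with the biased noise (DDIM-style step).
    alpha = 1.0 - t_strength
    return np.sqrt(alpha) * source_latent + np.sqrt(1.0 - alpha) * biased
```

With a larger `mix`, the resulting noisy latent correlates more strongly with the light map, which is the intended starting point for denoising toward the conditioned illumination.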

Paper Structure

This paper contains 19 sections, 6 equations, 14 figures, and 5 tables.

Figures (14)

  • Figure 1: LightCtrl can relight an input video to produce high-quality results with strong temporal consistency and, particularly, illumination that closely follows the user-specified light trajectory.
  • Figure 2: Overall Pipeline. We perform controllable video relighting with a user-provided light trajectory. The Light Map Injection module injects the light map into the noisy latent of the VDM. Then, at each denoising step, the Geometry-Aware Relighting module produces relighted results frame-wise. The VDM thus generates temporally consistent video results with controllable lighting.
  • Figure 3: The framework of the Geometry-Aware Relighting Module, designed for frame-wise relighting. First, the normal sequence of the source video and the consistent targets derived from each denoising step are encoded into the latent space and processed in the frequency domain. Their spatio-temporal low-frequency components are then dynamically fused with the high-frequency components of the consistent latents, with the cut-off frequency varying at each denoising step to add details. Finally, these latents are transformed back to the video space and the relighted frames are produced using frame-wise IC-Light iclight.
  • Figure 4: Qualitative comparison of light trajectory control. We compare our LightCtrl with IC-Light, IC-Light + SDEdit-0.2, IC-Light + SDEdit-0.6, and LAV-Traj. Compared to baselines, LightCtrl produces high-quality, temporally coherent, and controllable relighting results, closely following the input light trajectories.
  • Figure 5: Ablation Study. Results of controllable video relighting with the Geometry-Aware Relighting (GAR) module or the Light Map Injection (LMI) module removed. The red box indicates the light distribution inherited from the input video.
  • ...and 9 more figures
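The frequency-domain fusion described for the Geometry-Aware Relighting module (Figure 3) can be sketched as below. This is a minimal illustration under assumptions: the function and argument names are hypothetical, a simple radial low-pass mask stands in for the paper's cut-off schedule, and real latents would be 4D/5D tensors rather than small arrays.

```python
import numpy as np

def fuse_frequency(normal_latent, consistent_latent, cutoff):
    """Hypothetical sketch: keep low frequencies from the normal-map latent
    and high frequencies from the denoised 'consistent' latent.

    `cutoff` is a normalized radius in [0, sqrt(ndim)]; the paper varies the
    cut-off frequency across denoising steps (schedule not reproduced here).
    """
    # Move both latents to the frequency domain, DC component centered.
    F_n = np.fft.fftshift(np.fft.fftn(normal_latent))
    F_c = np.fft.fftshift(np.fft.fftn(consistent_latent))
    # Build a centered radial low-pass mask over all spatio-temporal axes.
    grids = np.meshgrid(
        *[np.linspace(-1.0, 1.0, s) for s in normal_latent.shape], indexing="ij"
    )
    radius = np.sqrt(sum(g ** 2 for g in grids))
    low_pass = (radius <= cutoff).astype(float)
    # Low frequencies from the normal latent, high frequencies from the other.
    fused = F_n * low_pass + F_c * (1.0 - low_pass)
    return np.real(np.fft.ifftn(np.fft.ifftshift(fused)))
```

Raising `cutoff` toward its maximum recovers the normal latent (all-pass for it), while a cutoff of zero or below passes the consistent latent through unchanged; intermediate values realize the low/high split.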