LightMover: Generative Light Movement with Color and Intensity Controls

Gengze Zhou, Tianyu Wang, Soo Ye Kim, Zhixin Shu, Xin Yu, Yannick Hold-Geoffroy, Sumit Chaturvedi, Qi Wu, Zhe Lin, Scott Cohen

Abstract

We present LightMover, a framework for controllable light manipulation in single images that leverages video diffusion priors to produce physically plausible illumination changes without re-rendering the scene. We formulate light editing as a sequence-to-sequence prediction problem in visual token space: given an image and light-control tokens, the model adjusts light position, color, and intensity together with resulting reflections, shadows, and falloff from a single view. This unified treatment of spatial (movement) and appearance (color, intensity) controls improves both manipulation and illumination understanding. We further introduce an adaptive token-pruning mechanism that preserves spatially informative tokens while compactly encoding non-spatial attributes, reducing control sequence length by 41% while maintaining editing fidelity. To train our framework, we construct a scalable rendering pipeline that generates large numbers of image pairs across varied light positions, colors, and intensities while keeping the scene content consistent with the original image. LightMover enables precise, independent control over light position, color, and intensity, and achieves high PSNR and strong semantic consistency (DINO, CLIP) across different tasks.
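The paper does not spell out the token-pruning mechanism at this point; purely as an illustration of the idea, a minimal sketch might keep the highest-scoring spatial tokens (here scored by per-token feature variance, a stand-in for whatever learned saliency the authors use) and append a single compact token for the non-spatial color/intensity attributes. All names, dimensions, and the scoring rule below are assumptions, not the paper's implementation; the keep ratio of 0.59 mirrors the reported 41% reduction in control sequence length.

```python
import numpy as np

def prune_control_tokens(tokens, keep_ratio=0.59):
    """Keep the most spatially informative tokens.

    Saliency is approximated by per-token feature variance; the real
    model presumably learns this score. Kept tokens retain their
    original order so spatial layout is preserved.
    """
    n = tokens.shape[0]
    k = max(1, int(round(n * keep_ratio)))
    scores = tokens.var(axis=1)                    # hypothetical saliency
    keep = np.sort(np.argsort(scores)[::-1][:k])   # top-k, original order
    return tokens[keep]

def encode_appearance(color_rgb, intensity, dim):
    """Compactly encode non-spatial attributes as one extra token."""
    vec = np.zeros(dim)
    vec[:3] = color_rgb
    vec[3] = intensity
    return vec[None, :]

rng = np.random.default_rng(0)
visual = rng.normal(size=(256, 16))       # 256 spatial control tokens
pruned = prune_control_tokens(visual)     # ~41% shorter sequence
control = np.concatenate(
    [pruned, encode_appearance([1.0, 0.8, 0.6], 0.5, 16)]
)
print(pruned.shape, control.shape)
```

The point of the sketch is only the shape of the interface: spatial controls stay as a (shortened) token sequence, while color and intensity collapse into a fixed, tiny number of tokens.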

Paper Structure

This paper contains 27 sections, 8 equations, 18 figures, 5 tables.

Figures (18)

  • Figure 1: Results of light source movement. LightMover demonstrates robust control over light movement, color, and intensity in diverse illumination scenarios. It maintains global radiometric balance, reconstructs occluded illumination, and infers plausible light–object interactions under varying geometry and materials. Beyond single-attribute control, LightMover supports compositional manipulation that simultaneously adjusts light position, color, and intensity. These results highlight the potential of our 2.5D learning paradigm to achieve physically coherent lighting reasoning within a purely 2D generative framework.
  • Figure 2: Overview of LightMover. Left: our sequence-to-sequence formulation encodes a reference image, object crop, movement map, and optional color/intensity controls as input frames. Right: additional supported tasks, including light removal and insertion.
  • Figure 3: Illustration of training samples. Top: our synthetic data varies across scenes, objects, and lighting conditions; Bottom: the image-relighting post-processing applied during training.
  • Figure 4: Comparison of light movement results. The blue bounding box in the input indicates the target light location. For Gemini-2.5-Flash-Image, the first column shows the two-step modular editing setup, and the second column shows the one-step editing result. Compared with ObjectMover, LightMover accurately handles light propagation and its interaction with the environment. Compared with Gemini-2.5-Flash-Image, LightMover provides more precise control and produces more coherent and photometrically consistent relighting.
  • Figure 5: Light insertion/removal results. Compared to Nano-Banana (Gemini-2.5-Flash-Image), LightMover produces more physically consistent edits, preserving the background while correctly adding or removing the illumination effects associated with the target light source.
  • ...and 13 more figures