Light4D: Training-Free Extreme Viewpoint 4D Video Relighting

Zhenghuang Wu; Kang Chen; Zeyu Zhang; Hao Tang

TL;DR

Light4D tackles training-free 4D video relighting under extreme viewpoint changes by coupling a geometry-focused EX-4D backbone with IC-Light as an illumination prior. It introduces Disentangled Flow Guidance (DFG) with a time-aware fusion schedule to separate geometry from lighting, and Temporal Consistent Attention (TCA) with deterministic regularization to ensure temporal stability. The approach is evaluated against training-based and training-free baselines across diverse scenes and viewpoint ranges of up to $180^{\circ}$, showing strong lighting fidelity and reduced flicker while preserving 4D geometry. Its modular design allows future video-native illumination models and more powerful 4D backbones to be integrated, offering scalable potential for controllable 4D content creation.

Abstract

Recent advances in diffusion-based generative models have established a new paradigm for image and video relighting. However, extending these capabilities to 4D relighting remains challenging, primarily because paired 4D relighting training data is scarce and temporal consistency is hard to maintain across extreme viewpoints. In this work, we propose Light4D, a novel training-free framework designed to synthesize consistent 4D videos under target illumination, even under extreme viewpoint changes. First, we introduce Disentangled Flow Guidance, a time-aware strategy that effectively injects lighting control into the latent space while preserving geometric integrity. Second, to reinforce temporal consistency, we develop Temporal Consistent Attention within the IC-Light architecture and further incorporate deterministic regularization to eliminate appearance flickering. Extensive experiments demonstrate that our method achieves competitive performance in temporal consistency and lighting fidelity, robustly handling camera rotations from $-90^{\circ}$ to $90^{\circ}$. Code: https://github.com/AIGeeksGroup/Light4D. Website: https://aigeeksgroup.github.io/Light4D.
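To make the time-aware idea behind Disentangled Flow Guidance more concrete, the following is a minimal sketch of a multi-phase fusion weight of the kind the paper denotes $\lambda(t)$. It is not the authors' implementation: the three phases (geometry only, cosine ramp, hold), the normalized progress variable `t`, the threshold values, and the names `fusion_weight`, `fuse_velocity`, `tau_l`, and `lam_max` are all assumptions made for illustration. Only the symbol $\tau_g$ and the principle of completing geometry before injecting lighting come from the source.

```python
import math


def fusion_weight(t, tau_g=0.3, tau_l=0.8, lam_max=1.0):
    """Hypothetical schedule lambda(t) over normalized sampling progress t in [0, 1].

    Assumed phases (illustrative, not taken from the paper):
      t < tau_g           -> 0.0      geometry-only phase (backbone alone)
      tau_g <= t < tau_l  -> ramp     lighting guidance is blended in smoothly
      t >= tau_l          -> lam_max  lighting guidance held at full strength
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if t < tau_g:
        return 0.0
    if t >= tau_l:
        return lam_max
    s = (t - tau_g) / (tau_l - tau_g)  # position inside the ramp, in [0, 1)
    return lam_max * 0.5 * (1.0 - math.cos(math.pi * s))


def fuse_velocity(v_geo, v_light, lam):
    """Convex blend of a geometry-branch and a lighting-branch velocity.

    v_geo and v_light are flat lists of floats standing in for latent tensors.
    """
    return [(1.0 - lam) * g + lam * l for g, l in zip(v_geo, v_light)]


if __name__ == "__main__":
    for step in range(0, 11, 2):
        t = step / 10
        print(f"t={t:.1f}  lambda={fusion_weight(t):.3f}")
```

In a real sampler, `v_geo` would come from the 4D backbone and `v_light` from the illumination model at the same step. Raising `tau_g` in this sketch delays lighting injection so that more of the trajectory is spent on geometry alone.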

Paper Structure (25 sections, 10 equations, 10 figures, 5 tables)

This paper contains 25 sections, 10 equations, 10 figures, 5 tables.

Introduction
Related Work
The Proposed Method
Overview
Disentangled Flow Guidance
Temporal Consistent Attention
Experiment
Experimental Setup
Main Results
Ablation Studies
Limitation and Future Work
Conclusion
Additional Ablation Studies
Ablation on Geometric Isolation Phase ($\tau_g$)
Deterministic Coherence and Regularization
...and 10 more sections

Figures (10)

Figure 1: Overview of the Light4D framework. Our training-free approach employs a time-aware paradigm in a latent flow-matching process. Using a multi-phase adaptive schedule $\lambda(t)$, we prioritize 3D geometric completion via the EX-4D backbone before injecting illumination cues through IC-Light.
Figure 2: Design of Temporal Consistent Attention (TCA). TCA enforces temporal coherence through a dual-path mechanism that interpolates between standard self-attention (Path A) for frame-specific structure and a consistent path (Path B) that regularizes appearance context via Gaussian-weighted sliding window aggregation.
Figure 3: Qualitative relighting results. Comparison of baselines and our method under two prompts: "Sunlight" (left) and "Pink neon light" (right). Our method yields more stable illumination changes over time and reduces temporal flicker.
Figure 4: 4D video quality under extreme viewpoints. Qualitative comparison of baselines and our method using the prompt "Sunlight". Under extreme viewpoint changes, our method better balances relighting coherence, 4D geometric stability, and detail fidelity.
Figure 5: Quantitative ablation on the geometric isolation threshold $\tau_g$. (a) HFPR scores, (b) Temporal CLIP scores, and (c) Motion Flow L1 errors are shown.
...and 5 more figures
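The dual-path mechanism described in the Figure 2 caption can be illustrated with a small sketch. It assumes that Path B builds its keys and values as a Gaussian-weighted average over neighbouring frames and that the two paths are mixed by a scalar `alpha`. These choices, along with the names `tca`, `aggregate`, `radius`, and `sigma`, are hypothetical and not taken from the paper, whose exact formulation may differ. Plain Python lists stand in for tensors of shape `[frames][tokens][dim]`.

```python
import math


def gaussian_window_weights(center, num_frames, radius=2, sigma=1.0):
    """Normalized Gaussian weights over the frames within `radius` of `center`."""
    lo, hi = max(0, center - radius), min(num_frames - 1, center + radius)
    raw = {j: math.exp(-((j - center) ** 2) / (2.0 * sigma ** 2))
           for j in range(lo, hi + 1)}
    total = sum(raw.values())
    return {j: w / total for j, w in raw.items()}


def aggregate(frames, center, radius=2, sigma=1.0):
    """Gaussian-weighted temporal average of token matrices around `center`."""
    weights = gaussian_window_weights(center, len(frames), radius, sigma)
    n_tokens, dim = len(frames[0]), len(frames[0][0])
    return [[sum(w * frames[j][n][d] for j, w in weights.items())
             for d in range(dim)]
            for n in range(n_tokens)]


def attention(q, k, v):
    """Scaled dot-product attention for one frame (lists of token vectors)."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        logits = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in logits]
        z = sum(exps)
        out.append([sum(e / z * vj[d] for e, vj in zip(exps, v))
                    for d in range(len(v[0]))])
    return out


def tca(q, k, v, alpha=0.5, radius=2, sigma=1.0):
    """Blend per-frame attention (Path A) with window-aggregated attention (Path B)."""
    result = []
    for f in range(len(q)):
        path_a = attention(q[f], k[f], v[f])
        k_bar = aggregate(k, f, radius, sigma)  # temporally smoothed keys
        v_bar = aggregate(v, f, radius, sigma)  # temporally smoothed values
        path_b = attention(q[f], k_bar, v_bar)
        result.append([[(1.0 - alpha) * a + alpha * b for a, b in zip(ra, rb)]
                       for ra, rb in zip(path_a, path_b)])
    return result
```

With `alpha=0.0` the sketch reduces to ordinary per-frame self-attention. Larger values of `alpha` pull each frame's appearance context toward its temporal neighbourhood, which is the intuition behind reducing flicker.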
