Relightable Holoported Characters: Capturing and Relighting Dynamic Human Performance from Sparse Views

Kunwar Maheep Singh, Jianchun Chen, Vladislav Golyanik, Stephan J. Garbin, Thabo Beeler, Rishabh Dabral, Marc Habermann, Christian Theobalt

TL;DR

This work tackles relighting dynamic human performances from sparse RGB views by introducing Relightable Holoported Characters (RHC), an end-to-end pipeline that animates a subject-specific mesh, computes physics-informed features in UV space, and uses a transformer-based RelightNet to predict texel-aligned 3D Gaussians for relit appearance in unseen lighting and viewpoints. A novel multi-view lightstage capture strategy interleaves environment-map lighting with tracking frames, enabling diverse illumination and accurate geometry guidance without OLAT. The method demonstrates superior visual fidelity and lighting reproduction over state-of-the-art baselines across multiple subjects and unseen motions, with ablations confirming the importance of geometry, albedo, shading, and cross-attention mechanisms. While limited by identity specificity and runtime, RHC opens pathways for scalable, photorealistic relighting of dynamic human avatars in virtual environments and telepresence.

Abstract

We present Relightable Holoported Characters (RHC), a novel person-specific method for free-view rendering and relighting of full-body and highly dynamic humans solely observed from sparse-view RGB videos at inference. In contrast to classical one-light-at-a-time (OLAT)-based human relighting, our transformer-based RelightNet predicts relit appearance within a single network pass, avoiding costly OLAT-basis capture and generation. For training such a model, we introduce a new capture strategy and dataset recorded in a multi-view lightstage, where we alternate frames lit by random environment maps with uniformly lit tracking frames, simultaneously enabling accurate motion tracking and diverse illumination as well as dynamics coverage. Inspired by the rendering equation, we derive physics-informed features that encode geometry, albedo, shading, and the virtual camera view from a coarse human mesh proxy and the input views. Our RelightNet then takes these features as input and cross-attends them with a novel lighting condition, and regresses the relit appearance in the form of texel-aligned 3D Gaussian splats attached to the coarse mesh proxy. Consequently, our RelightNet implicitly learns to efficiently compute the rendering equation for novel lighting conditions within a single feed-forward pass. Experiments demonstrate our method's superior visual fidelity and lighting reproduction compared to state-of-the-art approaches. Project page: https://vcai.mpi-inf.mpg.de/projects/RHC/
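For reference, the rendering equation that the abstract invokes is commonly written in the form below (emission term omitted). This is the standard textbook formulation, not necessarily the notation used in the paper, whose own equations are not reproduced in this summary. The symbols are assumptions chosen for illustration: $\mathbf{x}$ is a surface point, $\mathbf{n}$ its normal, $\omega_i$ and $\omega_o$ the incoming and outgoing directions, $f_r$ the BRDF, and $L_i$ the incident radiance supplied by the environment map.

```latex
L_o(\mathbf{x}, \omega_o)
  = \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o)\,
    L_i(\mathbf{x}, \omega_i)\,
    (\mathbf{n} \cdot \omega_i)\, \mathrm{d}\omega_i
```

Read loosely, the physics-informed features line up with the terms of this integral: geometry with $\mathbf{x}$ and $\mathbf{n}$, albedo with $f_r$, shading with the cosine-weighted incident light, and the view feature with $\omega_o$. This correspondence is an interpretation of the abstract rather than a statement taken from the paper.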

Paper Structure

This paper contains 23 sections, 16 equations, 16 figures, 3 tables.

Figures (16)

  • Figure 1: We present Relightable Holoported Characters, the first method that takes sparse RGB images of a full-body human and generates photorealistic and relightable renderings, allowing for seamless placement of photoreal twins of real and dynamically moving humans into virtual environments.
  • Figure 2: Illustration of our data capture strategy. To learn a relightable full-body avatar, we propose to capture multi-view video sequences consisting of consecutive uniformly lit tracking frames and relit frames obtained by randomly projecting environment maps onto the lightstage LEDs.
  • Figure 3: Our approach, Relightable Holoported Characters (RHC). Given four input views under flat lighting, skeleton pose, environment map, and camera parameters, our method generates photorealistic relighting. First, a mesh-based avatar is animated using the skeleton pose (Sec. \ref{sec:char_model}). Physics-informed features (Sec. \ref{sec:feat}), namely Geometry, Albedo, Shading, and View Features, are extracted from sparse-view images and mesh tracking, and fed into RelightNet (Sec. \ref{sec:relightnet}), which uses cross-attention to condition on the environment map. RelightNet predicts per-texel Gaussian parameters, which are placed on the mesh and splatted into the camera view.
  • Figure 4: Qualitative results. Here, we show the 4 sparse input views of a person performing unseen motions. Our method, RHC, is then able to photorealistically render free-views under novel lighting conditions. Note that we also visualize different poses here, demonstrating the robustness to arbitrary skeletal motions.
  • Figure 5: Qualitative comparison. We compare our method against state-of-the-art methods, including Relighting4D with ground-truth environment maps (R4D + GT Env.) [chen2022relighting4d], since its original setting differs from our task. We also compare to advanced R4D variants, IA [wang2024intrinsicavatar] and MA [chen2024meshavatar], augmented with ground-truth maps. Additionally, we compare to a sparse image-driven, non-relightable method, Holoported Characters (HPC) [shetty2024holoported], and to a relightable version (HPC + NG) that uses a recent image-based relighting network [jin2024neural]. Our method consistently outperforms all baselines across subjects and environment maps.
  • ...and 11 more figures
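
The Figure 3 caption states that RelightNet uses cross-attention to condition on the environment map. A minimal sketch of that mechanism, assuming single-head scaled dot-product attention with no learned projections, could look as follows. All names here (`cross_attend`, `texel_queries`, `light_keys`, `light_values`) are hypothetical and the toy vectors are made up; this illustrates generic cross-attention, not the authors' architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attend(texel_queries, light_keys, light_values):
    """Single-head scaled dot-product cross-attention.

    texel_queries: one feature vector per texel (hypothetical stand-in
                   for the physics-informed features).
    light_keys, light_values: one vector per environment-map token
                   (hypothetical tokenization of the lighting).
    Returns one lighting-conditioned vector per texel.
    """
    scale = 1.0 / math.sqrt(len(light_keys[0]))
    value_dim = len(light_values[0])
    outputs = []
    for query in texel_queries:
        # How strongly this texel attends to each lighting token.
        weights = softmax([dot(query, key) * scale for key in light_keys])
        # Weighted sum of the lighting values.
        outputs.append([
            sum(w * value[d] for w, value in zip(weights, light_values))
            for d in range(value_dim)
        ])
    return outputs


# Toy data (assumed): two texels, two lighting tokens.
texels = [[1.0, 0.0], [0.0, 1.0]]
keys = [[10.0, 0.0], [0.0, 10.0]]
values = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
relit = cross_attend(texels, keys, values)
```

In this toy setup each texel's query aligns with one lighting key, so its output is dominated by the matching value vector. A trained model would additionally apply learned query, key, and value projections and would typically use multiple heads.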