Intrinsic PAPR for Point-level 3D Scene Albedo and Shading Editing

Alireza Moazeni; Shichong Peng; Ke Li

TL;DR

Intrinsic PAPR tackles point-level 3D albedo and shading editing from multi-view RGB images by decomposing per-point features into albedo and shading within the Proximity Attention Point Rendering (PAPR) framework. Building on PAPR, the method learns per-point albedo and shading representations and supervises albedo with a pretrained intrinsic decomposition model, enabling 3D-consistent editing without heavy shading priors. Experiments show higher novel-view rendering quality than point-based inverse rendering baselines and precise point-level albedo and shading edits, including transfer across regions and scenes and control over shading intensity. The approach offers a scalable, editing-friendly alternative to inverse rendering; its limitations include bias inherited from the pretrained albedo model, and the paper notes potential societal implications of 3D content manipulation.

Abstract

Recent advancements in neural rendering have excelled at novel view synthesis from multi-view RGB images. However, they often lack the capability to edit the shading or colour of the scene at a detailed point level while ensuring consistency across different viewpoints. In this work, we address the challenge of point-level 3D scene albedo and shading editing from multi-view RGB images, focusing on detailed editing at the point level rather than at a part or global level. While prior works based on volumetric representations such as NeRF struggle to achieve 3D-consistent editing at the point level, recent advancements in point-based neural rendering show promise in overcoming this challenge. We introduce "Intrinsic PAPR", a novel method based on the recent point-based neural rendering technique Proximity Attention Point Rendering (PAPR). Unlike other point-based methods that model the intrinsic decomposition of the scene, our approach does not rely on complicated shading models or simplistic priors that may not universally apply. Instead, we directly model scene decomposition into albedo and shading components, leading to better estimation accuracy. Comparative evaluations against the latest point-based inverse rendering methods demonstrate that Intrinsic PAPR achieves higher-quality novel view rendering and superior point-level albedo and shading editing.

Paper Structure (34 sections, 3 equations, 18 figures, 5 tables)

Introduction
Related Work
Neural Scene Representation
Intrinsic Decomposition
Method
Preliminaries: Intrinsic Decomposition
Choice of Scene Representation
Overview: Proximity Attention Point Rendering (PAPR)
Intrinsic PAPR
Training Details
Experiments
Novel View Synthesis
Point-level Albedo and Shading Editing
Point-level Albedo Transfer
Point-level Shading Transfer
...and 19 more sections

Figures (18)

Figure 1: We introduce Intrinsic PAPR, a novel method for point-level 3D scene albedo and shading editing. By leveraging the recent point-based rendering technique, PAPR, our method models the scene decomposition into albedo and shading components. This enables detailed, point-level albedo and shading edits that remain consistent across different viewpoints.
Figure 2: An overview of the rendering pipeline. Each point in our scene representation contains a spatial position, albedo, shading feature vectors, and an influence score. (a) Ray-dependent embeddings are generated for each point, incorporating the key, albedo value, and shading value, along with the ray direction forming the query, which together serve as inputs to the attention model. (b) The attention model uses the key and query to select points and combine their albedo and shading values, producing corresponding feature maps. (c) The albedo feature map is fed to an albedo feature renderer (green) to generate the albedo image. Both shading and albedo feature maps are input to a separate feature renderer (blue) for generating the final colour image output. The model is trained end-to-end with supervision on both the albedo output and the colour image output.
Figure 3: Illustrative comparison between splat-based renderers and attention-based renderers like PAPR. (a) Splat-based methods render scene appearance using information stored at discrete splats. Due to their discrete nature, changes made to splats on one side do not influence those on the other side, resulting in an abrupt transition at the boundary. (b) PAPR renders appearance through interpolation among points. The interpolated appearance between edited points on one side and unedited ones on the other side naturally creates a smooth transition.
Figure 4: Point-level Albedo Transfer: Qualitative comparison of point-level albedo transfer on the NeRF synthetic dataset (Mildenhall et al., 2020). Our method effectively transfers the albedo from a source point (marked in blue) to target points (marked in red) by transferring its albedo feature vector. This transfer maintains the surface details and optical properties of the target region. Furthermore, our method ensures that the shading intensity at the target region remains unaffected by the shading at the source point. In contrast, the latest inverse rendering baselines struggle to transfer the correct colour (Lego, Hotdog), lose surface details (Lego, Materials) and optical properties like reflectivity (Materials), and fail to preserve shading at the target regions post-transfer (Hotdog).
Figure 5: Point-level Shading Transfer: Qualitative comparison of point-level shading transfer on the NeRF synthetic dataset (Mildenhall et al., 2020). Our method demonstrates effective shading transfer from a source point (marked in blue) to target points (marked in red) by transferring its shading feature vector. Importantly, our method preserves the albedo at the target region without introducing the albedo from the source point. This highlights our method's capability to encode distinct information in albedo and shading features. In contrast, GS-IR (Liang et al., 2023) struggles to accurately transfer shading values from the source, resulting in a similar appearance to the original view and introducing high-frequency noise (Lego, Hotdog, Chair). DPIR (Chung et al., 2023) also inaccurately transfers shading and fails to decouple shading from albedo in the source point, leading to undesired changes in the albedo of the target area (Lego).
...and 13 more figures
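
The captions for Figures 2 and 4 describe two mechanisms: an attention model that blends per-point albedo and shading values for each ray, and albedo transfer performed by copying a source point's albedo feature vector onto target points. The sketch below illustrates both on toy data. It is not the authors' implementation: the plain dot-product softmax attention, the feature sizes, and the names `aggregate` and `transfer_albedo` are assumptions made for illustration, and the learned embeddings, point selection, influence scores, and feature renderers of the actual method are omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def aggregate(query, points):
    """Blend per-point albedo and shading features for one ray.

    Assumption: attention weights are a softmax over key-query dot products.
    """
    weights = softmax([dot(query, p["key"]) for p in points])
    dim_a = len(points[0]["albedo"])
    dim_s = len(points[0]["shading"])
    albedo = [sum(w * p["albedo"][i] for w, p in zip(weights, points))
              for i in range(dim_a)]
    shading = [sum(w * p["shading"][i] for w, p in zip(weights, points))
               for i in range(dim_s)]
    return albedo, shading

def transfer_albedo(points, source, targets):
    """Return edited points whose targets carry the source albedo feature.

    Keys and shading features are left untouched; the input list is not modified.
    """
    edited = [dict(p) for p in points]
    for t in targets:
        edited[t]["albedo"] = list(points[source]["albedo"])
    return edited

# Toy scene: three points with hypothetical 2-D keys and small feature vectors.
points = [
    {"key": [1.0, 0.0], "albedo": [0.9, 0.1], "shading": [0.2]},
    {"key": [0.0, 1.0], "albedo": [0.1, 0.8], "shading": [0.7]},
    {"key": [0.5, 0.5], "albedo": [0.4, 0.4], "shading": [0.5]},
]
query = [0.3, 0.9]  # stands in for a ray-dependent query

albedo_before, shading_before = aggregate(query, points)
edited = transfer_albedo(points, source=0, targets=[1, 2])
albedo_after, shading_after = aggregate(query, edited)
```

Because the edit touches only the albedo features, the attention weights and the blended shading feature for the ray are identical before and after the transfer, which mirrors the caption's claim that shading at the target region is unaffected. Blending among neighbouring points is also what Figure 3 credits for smooth transitions at edit boundaries.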
