NeRF-Texture: Synthesizing Neural Radiance Field Textures

Yi-Hua Huang, Yan-Pei Cao, Yu-Kun Lai, Ying Shan, Lin Gao

TL;DR

NeRF-Texture addresses the challenge of synthesizing textures with meso-structure and view-dependent appearance on 3D surfaces from multi-view data. It introduces a coarse–fine disentanglement in which a base shape hosts latent texture features, enabling high-frequency meso-structure synthesis via implicit patch matching and hash-grid retrieval, and extends this to curved surfaces through a high-resolution UV atlas and pyramid-based synthesis. A clustering constraint regularizes the latent feature distribution to improve patch matching, while an SH/Phong-based shading model renders realistic view-dependent appearance. The approach produces realistic NeRF-based textures on planar and curved geometries, supports texture transfer to new shapes, and renders in real time, with comparisons and ablations against 2D textures and NeRF-Tex favoring the method. It thereby enables texture synthesis from real-world multi-view data for graphics and vision applications.

Abstract

Texture synthesis is a fundamental problem in computer graphics that would benefit various applications. Existing methods are effective in handling 2D image textures. In contrast, many real-world textures contain meso-structure in the 3D geometry space, such as grass, leaves, and fabrics, which cannot be effectively modeled using only 2D image textures. We propose a novel texture synthesis method with Neural Radiance Fields (NeRF) to capture and synthesize textures from given multi-view images. In the proposed NeRF texture representation, a scene with fine geometric details is disentangled into the meso-structure textures and the underlying base shape. This allows textures with meso-structure to be effectively learned as latent features situated on the base shape, which are fed into a NeRF decoder trained simultaneously to represent the rich view-dependent appearance. Using this implicit representation, we can synthesize NeRF-based textures through patch matching of latent features. However, inconsistencies between the metrics of the reconstructed content space and the latent feature space may compromise the synthesis quality. To enhance matching performance, we further regularize the distribution of latent features by incorporating a clustering constraint. In addition to generating NeRF textures over a planar domain, our method can also synthesize NeRF textures over curved surfaces, which are practically useful. Experimental results and evaluations demonstrate the effectiveness of our approach.
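
The idea of patch matching over latent features can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a small 2D grid of latent feature vectors and uses a brute-force nearest-neighbor search under squared L2 distance. The names `extract_patch` and `best_match` are hypothetical and chosen only for this illustration.

```python
def extract_patch(grid, top, left, size):
    """Flatten the size x size block of feature vectors starting at (top, left)."""
    return [
        value
        for row in range(top, top + size)
        for col in range(left, left + size)
        for value in grid[row][col]
    ]


def best_match(grid, query, size):
    """Return ((top, left), cost) of the exemplar patch closest to `query`.

    Assumption: closeness is squared L2 distance between flattened latent
    patches, searched exhaustively over every patch position.
    """
    rows, cols = len(grid), len(grid[0])
    best_pos, best_cost = None, float("inf")
    for top in range(rows - size + 1):
        for left in range(cols - size + 1):
            candidate = extract_patch(grid, top, left, size)
            cost = sum((a - b) ** 2 for a, b in zip(candidate, query))
            if cost < best_cost:
                best_pos, best_cost = (top, left), cost
    return best_pos, best_cost


# Toy exemplar: a 3 x 3 grid holding 2-dimensional latent features.
exemplar = [[[float(r * 3 + c), float(r)] for c in range(3)] for r in range(3)]
query = extract_patch(exemplar, 1, 1, 2)
position, cost = best_match(exemplar, query, 2)
```

Because the match is decided purely by distances in latent space, it is only as good as the agreement between latent distance and visual similarity, which is the motivation the abstract gives for the clustering constraint.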

Paper Structure

This paper contains 32 sections, 6 equations, 22 figures, 3 tables, 1 algorithm.

Figures (22)

  • Figure 1: Given a set of multi-view images of the target texture with meso-structure, our model synthesizes Neural Radiance Field (NeRF) textures, which can then be applied to novel shapes, such as the skirt and hat in the figure, with rich geometric and appearance details.
  • Figure 2: Overview of our method. Given a set of multi-view images of a scene, we first estimate its base shape. Based on it, we model the scene with a disentangled representation of the base shape and NeRF texture with meso-structure. The query point $x$ is projected onto the base shape as footpoint $x_c$. Latent features $f(x),\hat{f}(x)$ representing textures are fetched by feeding $x_c$ to hash grids. The matrices of the local tangent space $T_c(x)$, the latent features $f(x),\hat{f}(x)$, and the SDF value $s(x)$ are fed into the rendering module (RM). The density $\sigma(x)$, the coefficients of the Phong shading model $k_d(x),k_s(x),g(x)$, and the elevation and azimuth angles of the fine normal $\theta(x),\phi(x)$ are predicted from the input features and SDF. The color $c(x)$ of the query point $x$ is calculated by Spherical Harmonic (SH) rendering based on the coarse and fine normals $n_c(x), n_f(x)$, the viewing direction $d$, the shading coefficients $k_d(x),k_s(x),g(x)$, and the lighting SHs. Based on the implicit texture representation (ITR), we extract implicit patches from the base shape and synthesize texture by an implicit patch matching algorithm. By querying $f(x),\hat{f}(x)$ and $T_c(x)$ from the synthesized implicit textures, we are able to render the appearance of the synthesized texture.
  • Figure 3: Base Shape Extraction. We show the intermediate outputs during the base shape extraction, including NGP [mueller2022instant], Co-ACD [wei2022approximate], and re-meshing [huang2018robust, vollmer1999improved].
  • Figure 4: Illustration of Base Shape Projection in 2D. Point $x$ in Euclidean space is parameterized as the signed distance $s(x)$ and the projected footpoint $x_c$.
  • Figure 5: Shading Decomposition. Our model predicts the fine normal $n_f$ and decomposes the radiance into diffuse and specular components.
  • ...and 17 more figures
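
The 2D base shape projection described in the Figure 4 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the base shape is assumed to be a 2D polyline, the sign of $s(x)$ is assumed to follow the left-hand normal of the closest segment, and the function name `project_to_polyline` is hypothetical.

```python
import math


def project_to_polyline(x, polyline):
    """Return (footpoint x_c, signed distance s) of a 2D point x.

    Assumptions for this sketch: the base shape is a polyline given as a
    list of (px, py) vertices, and the sign is positive on the side of the
    closest segment's left-hand normal.
    """
    best = None
    for (ax, ay), (bx, by) in zip(polyline, polyline[1:]):
        ex, ey = bx - ax, by - ay
        seg_len_sq = ex * ex + ey * ey
        if seg_len_sq == 0.0:
            continue  # skip degenerate (zero-length) segments
        # Parameter of the closest point on the segment, clamped to [0, 1].
        t = ((x[0] - ax) * ex + (x[1] - ay) * ey) / seg_len_sq
        t = max(0.0, min(1.0, t))
        cx, cy = ax + t * ex, ay + t * ey
        dx, dy = x[0] - cx, x[1] - cy
        dist = math.hypot(dx, dy)
        # Dot product with the left-hand normal (-ey, ex) decides the sign.
        side = -ey * dx + ex * dy
        signed = dist if side >= 0.0 else -dist
        if best is None or dist < abs(best[1]):
            best = ((cx, cy), signed)
    if best is None:
        raise ValueError("polyline needs at least one non-degenerate segment")
    return best


# A flat base shape along the x-axis; one query above it, one below.
base_shape = [(0.0, 0.0), (2.0, 0.0)]
footpoint_above, s_above = project_to_polyline((1.0, 0.5), base_shape)
footpoint_below, s_below = project_to_polyline((1.0, -0.5), base_shape)
```

In this parameterization the footpoint says where on the base shape the texture features are looked up, and the signed distance says how far above or below the base shape the query point lies.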