LDM: Large Tensorial SDF Model for Textured Mesh Generation

Rengan Xie; Wenting Zheng; Kai Huang; Yizheng Chen; Qi Wang; Qi Ye; Wei Chen; Yuchi Huo

TL;DR

This work tackles fast, high-quality 3D asset generation from text or a single image without per-object optimization. It introduces LDM, a feed-forward pipeline in which a conditional multi-view diffusion model generates four input views and a transformer-based reconstructor predicts a tensorial SDF field from them, followed by gradient-based mesh refinement. Geometry and appearance share the tensorial representation, and color is decoupled into albedo and shading, which supports relighting and material editing. Training has two stages: volume rendering to learn global features, then FlexiCubes-based local refinement. The pipeline yields high-quality textured meshes in seconds, and the authors report better color and geometry metrics than prior methods.
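To illustrate why separating albedo from shading helps relighting, here is a minimal sketch. It assumes the simplest possible composition, color = albedo * shading with a scalar shading term per sample; the summary does not state the exact formula, and the names `compose_color` and `relight` are hypothetical.

```python
from typing import List, Tuple

RGB = Tuple[float, float, float]


def compose_color(albedo: RGB, shading: float) -> RGB:
    """Combine a lighting-independent albedo with a scalar shading term.

    Assumption (not taken from the paper): color = albedo * shading,
    clamped to the displayable range [0, 1].
    """
    return tuple(min(1.0, max(0.0, a * shading)) for a in albedo)


def relight(albedos: List[RGB], new_shading: List[float]) -> List[RGB]:
    """Re-render samples under new lighting by replacing only the shading."""
    if len(albedos) != len(new_shading):
        raise ValueError("need one shading value per albedo sample")
    return [compose_color(a, s) for a, s in zip(albedos, new_shading)]


if __name__ == "__main__":
    surface = [(0.8, 0.4, 0.2), (0.2, 0.6, 0.9)]
    # The same albedo is reused; only the per-sample shading changes.
    print(relight(surface, [0.5, 1.0]))
```

Because the albedo never contains baked-in shadows, swapping the shading values is enough to move the asset into a new lighting environment. A texture with lighting baked in would carry the old shadows along.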

Abstract

Previous efforts have managed to generate production-ready 3D assets from text or images. However, these methods primarily employ NeRF or 3D Gaussian representations, which are not adept at producing the smooth, high-quality geometry required by modern rendering pipelines. In this paper, we propose LDM, a novel feed-forward framework capable of generating high-fidelity, illumination-decoupled textured meshes from a single image or text prompt. We first use a multi-view diffusion model to generate sparse multi-view inputs from the image or text prompt, and then train a transformer-based model to predict a tensorial SDF field from these sparse multi-view images. Finally, we employ a gradient-based mesh optimization layer to refine this model, enabling it to produce an SDF field from which high-quality textured meshes can be extracted. Extensive experiments demonstrate that our method can generate diverse, high-quality 3D mesh assets with corresponding decomposed RGB textures within seconds.
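The abstract does not spell out how the tensorial field is stored. As a rough illustration only, the sketch below assumes a vector-matrix factorization in which a 3D grid is represented by a few 2D planes paired with 1D lines, so that a value is recovered as a sum of plane-times-line products. The example field is x^2 + y^2 + z^2 - r^2, which shares its zero level set with a sphere but is not a true signed distance; it was chosen because it factorizes exactly with two components. The function names are hypothetical.

```python
from typing import List, Tuple

Plane = List[List[float]]  # values indexed [i][j] on the XY grid
Line = List[float]         # values indexed [k] along Z


def query_field(planes: List[Plane], lines: List[Line],
                i: int, j: int, k: int) -> float:
    """Evaluate sum_r planes[r][i][j] * lines[r][k] at one grid vertex."""
    return sum(p[i][j] * ln[k] for p, ln in zip(planes, lines))


def sphere_factors(n: int, radius: float
                   ) -> Tuple[List[float], List[Plane], List[Line]]:
    """Rank-2 factors of f = x^2 + y^2 + z^2 - radius^2 on an n^3 grid
    spanning [-1, 1]^3 (an implicit sphere, not an exact distance)."""
    coords = [-1.0 + 2.0 * t / (n - 1) for t in range(n)]
    # Component A: (x^2 + y^2 - r^2) on the plane, constant 1 along Z.
    plane_a = [[x * x + y * y - radius * radius for y in coords]
               for x in coords]
    line_a = [1.0] * n
    # Component B: constant 1 on the plane, z^2 along Z.
    plane_b = [[1.0] * n for _ in coords]
    line_b = [z * z for z in coords]
    return coords, [plane_a, plane_b], [line_a, line_b]


if __name__ == "__main__":
    coords, planes, lines = sphere_factors(5, 0.5)
    print(query_field(planes, lines, 2, 2, 2))  # grid center, inside
    print(query_field(planes, lines, 0, 0, 0))  # grid corner, outside
```

On this 5x5x5 grid the two components store 2 * (25 + 5) = 60 numbers instead of 125 dense values, and the saving grows with resolution. A learned model would predict feature planes and lines rather than hand-built ones, and would interpolate between grid vertices.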

Paper Structure (26 sections, 7 equations, 13 figures, 3 tables)

Introduction
Related Work
Diffusion Models for Multi-view Synthesis
Lifting 2D Diffusion for 3D Generation
Feed-forward 3D Generative Models
Method
Conditional Multi-view Generation
Tensorial SDF and Decoupled Color Field
Feed-forward Large Reconstruction Model
Lifting SDF to Fine Mesh
Experiment
Implementation Details
Training Datasets
Training details
Comparison
...and 11 more sections

Figures (13)

Figure 1: Given a text prompt or a single image, our framework can generate corresponding high-quality 3D assets within seconds, including illumination-decoupled texture maps, facilitating integration into various applications, such as relighting and material editing.
Figure 2: Overview of our framework. Given an image or text prompt as the condition, we first use a diffusion model to generate images from multiple viewpoints. These images are then encoded into image feature tokens by the DINOv2 image encoder. The tokens are fed into a transformer-based tensorial object reconstructor, which outputs a tensorial SDF representation. This representation can then be rendered with volume rendering or the FlexiCubes rendering layer to produce images or extract meshes.
Figure 3: Comparing model training performance across different Beta schedules.
Figure 4: Qualitative comparison with baselines shows that our method produces high-quality 3D assets with smooth geometry and clear textures, which align well with the input image.
Figure 5: The effect of illumination-decoupled textures. We relight assets in new scenes using both illumination-decomposed and non-decomposed textures. The 3D assets without illumination decomposition display incorrect shadows in the new scenes.
...and 8 more figures
