FTSplat: Feed-forward Triangle Splatting Network

Xiong Jinlin, Li Can, Shen Jiawei, Qi Zhigang, Sun Lei, Zhao Dongyang

TL;DR

This work proposes a feed-forward framework for triangle primitive generation that directly predicts continuous triangle surfaces from calibrated multi-view images. It introduces a pixel-aligned triangle generation module and incorporates relative 3D point cloud supervision to enhance the stability and consistency of geometric learning.

Abstract

High-fidelity three-dimensional (3D) reconstruction is essential for robotics and simulation. While Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) achieve impressive rendering quality, their reliance on time-consuming per-scene optimization limits real-time deployment. Emerging feed-forward Gaussian splatting methods improve efficiency but often lack explicit, manifold geometry required for direct simulation. To address these limitations, we propose a feed-forward framework for triangle primitive generation that directly predicts continuous triangle surfaces from calibrated multi-view images. Our method produces simulation-ready models in a single forward pass, obviating the need for per-scene optimization or post-processing. We introduce a pixel-aligned triangle generation module and incorporate relative 3D point cloud supervision to enhance geometric learning stability and consistency. Experiments demonstrate that our method achieves efficient reconstruction while maintaining seamless compatibility with standard graphics and robotic simulators.

Paper Structure (12 sections, 6 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: Overview of the proposed FTSplat. Given multi-view input images, our feed-forward FTSplat directly and efficiently predicts a triangular surface representation of the scene. The reconstructed mesh supports photo-realistic novel view rendering and can be readily imported into simulation software such as Blender for downstream applications. Compared to existing optimization-based triangular surface methods, which typically require several minutes for reconstruction, our approach models a scene in under a second.
  • Figure 2: Overview of the proposed feed-forward triangular surface reconstruction network. Multi-view images are processed by a Multi-View Depth Estimation module to obtain fused features enriched with depth information. The fused features are used to predict depth maps and back-project an initial 3D point cloud, while a 2D U-Net with a triangle head decodes additional vertex attributes (opacity and spherical harmonics color). A surface generation module infers face connectivity to produce the final triangular surface. Differentiable rasterization enables photometric supervision, and external 3D point cloud supervision provides explicit geometric constraints during training.
  • Figure 3: Qualitative comparison of reconstruction quality between our method and optimization-based triangle rasterization methods.
  • Figure 4: Qualitative comparison of 3D spatial consistency between our method and feed-forward Gaussian Splatting methods.
  • Figure 5: Qualitative comparison of rendering quality under different ablation settings.
  • ...and 1 more figures
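
The geometric core of the pipeline in Figure 2 (predicted depth map, back-projected point cloud, triangle faces over pixel-aligned vertices) can be illustrated with a minimal sketch. It is not the authors' implementation. The pinhole intrinsics `fx, fy, cx, cy`, the function names `backproject` and `grid_faces`, and the fixed two-triangles-per-pixel-quad connectivity are all assumptions made for illustration. In the paper, a learned surface generation module infers the connectivity, and the per-vertex opacity and spherical harmonics attributes are omitted here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Face = Tuple[int, int, int]


def backproject(depth: Sequence[Sequence[float]],
                fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Lift every pixel (u, v) with depth d to a camera-space point.

    Assumes a pinhole camera: X = d * K^-1 [u, v, 1]^T.
    Points are returned in row-major pixel order.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            points.append(((u - cx) * d / fx, (v - cy) * d / fy, d))
    return points


def grid_faces(height: int, width: int) -> List[Face]:
    """Connect pixel-aligned vertices with two triangles per 2x2 pixel quad.

    This fixed grid connectivity is a stand-in (an assumption) for the
    learned face prediction described in the paper.
    """
    faces: List[Face] = []
    for v in range(height - 1):
        for u in range(width - 1):
            i = v * width + u                      # top-left vertex of the quad
            faces.append((i, i + width, i + 1))    # upper-left triangle
            faces.append((i + 1, i + width, i + width + 1))  # lower-right triangle
    return faces


# Toy 2x3 depth map of a fronto-parallel plane at depth 2.
depth = [[2.0, 2.0, 2.0],
         [2.0, 2.0, 2.0]]
vertices = backproject(depth, fx=1.0, fy=1.0, cx=1.0, cy=0.5)
faces = grid_faces(height=2, width=3)
```

An H x W depth map yields H*W vertices and 2*(H-1)*(W-1) faces, so the mesh size scales with the input resolution. A plain grid like this would also bridge depth discontinuities with stretched triangles, which is one reason connectivity is better inferred than hard-coded.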