Textured Geometry Evaluation: Perceptual 3D Textured Shape Metric via 3D Latent-Geometry Network

Tianyu Luan, Xuelu Feng, Zixin Zhu, Phani Nuney, Sheng Liu, Xuan Gong, David Doermann, Chunming Qiao, Junsong Yuan

TL;DR

Textured Geometry Evaluation (TGE) introduces a rendering-free 3D fidelity metric for textured meshes that operates directly on geometry and vertex colors. It employs a Latent-Geometry Set Abstraction (LG-SA) with cross-attention to fuse appearance and structure, producing latent representations of the input and the reference that are compared to compute a fidelity score. The authors also assemble the Colored Shape Fidelity dataset with real-world distortions and human annotations to train and validate the metric. Experiments show that TGE correlates better with human judgments on real distortions than rendering-based and geometry-only baselines, with favorable computational cost.

Abstract

Textured high-fidelity 3D models are crucial for games, AR/VR, and film, but human-aligned evaluation methods still fall behind despite recent advances in 3D reconstruction and generation. Existing metrics, such as Chamfer Distance, often fail to align with how humans evaluate the fidelity of 3D shapes. Recent learning-based metrics attempt to improve this by relying on rendered images and 2D image quality metrics. However, these approaches face limitations due to incomplete structural coverage and sensitivity to viewpoint choices. Moreover, most methods are trained on synthetic distortions, which differ significantly from real-world distortions, resulting in a domain gap. To address these challenges, we propose a new fidelity evaluation method that is based directly on 3D meshes with texture, without relying on rendering. Our method, named Textured Geometry Evaluation (TGE), jointly uses geometry and color information to calculate the fidelity of the input textured mesh relative to a reference colored shape. To train and evaluate our metric, we design a human-annotated dataset with real-world distortions. Experiments show that TGE outperforms rendering-based and geometry-only methods on our real-world distortion dataset.

Paper Structure

This paper contains 12 sections, 10 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Comparison between previous rendering-based evaluation and our proposed 3D latent-geometry-based metric. Prior works, such as Graphic-LPIPS, rely on rendering 3D shapes into 2D images and assessing fidelity using image-based quality metrics, which makes this approach sensitive to viewpoint. Such methods produce inconsistent scores depending on rendering conditions (Right). In contrast, our method directly operates on textured 3D meshes, without relying on rendering (Left).
  • Figure 2: (a) Overview of the TGE pipeline. Given a pair of textured 3D meshes (input and reference), we extract hierarchical geometry and color features using a PointNet++-style pipeline. A novel Latent-Geometry Set Abstraction (LG-SA) block is introduced to jointly fuse geometry and color information at each level. The resulting global features from both meshes are compared by a shared MLP to predict a scalar fidelity score. This design allows perceptual fidelity evaluation without any rendering. (b) Illustration of the Latent-Geometry Set Abstraction (LG-SA) block. The module extracts geometry and appearance features via parallel self-attention modules and fuses them using a cross-attention mechanism. Geometry features serve as the query to attend to latent color features.
  • Figure 3: Examples from our Colored Shape Fidelity dataset: (a) Reference meshes. (b) Distorted meshes reconstructed or generated by real-world methods.
  • Figure 4: Visual comparison of our metric with the previous metric G-LPIPS. Our metric aligns better with human annotations than the previous metric.
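
The fusion step described in the Figure 2(b) caption, where geometry features act as the query and attend to latent color features, can be illustrated with a minimal single-head cross-attention sketch. This is not the authors' implementation: the function name `cross_attention_fuse`, the projection matrices `w_q`, `w_k`, `w_v`, the residual connection, and the toy feature values are all assumptions made for illustration, and the parallel self-attention modules that precede the fusion in LG-SA are omitted.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply an (n x k) list-of-lists by a (k x m) list-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def cross_attention_fuse(geo, col, w_q, w_k, w_v):
    """Fuse per-point features: geometry supplies queries, color supplies keys/values.

    geo, col: N x d feature lists; w_q, w_k, w_v: d x d projections (hypothetical).
    Returns (fused N x d features, N x N attention weights).
    """
    q, k, v = matmul(geo, w_q), matmul(col, w_k), matmul(col, w_v)
    scale = math.sqrt(len(q[0]))
    # scores[i][j]: how strongly geometry point i attends to color point j
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    attn = [softmax(row) for row in scores]
    attended = matmul(attn, v)
    # Residual connection: an assumption of this sketch, not taken from the paper.
    fused = [[g + a for g, a in zip(g_row, a_row)] for g_row, a_row in zip(geo, attended)]
    return fused, attn

# Toy example: 3 points with 2-dimensional features and identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
geo = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # made-up geometry features
col = [[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]]   # made-up latent color features
fused, attn = cross_attention_fuse(geo, col, identity, identity, identity)
```

Each row of `attn` is a probability distribution over the color features, so every geometry point receives a weighted mix of appearance information. In a trained network the projections would be learned and the operation would typically be multi-head and batched.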