Dehallu3D: Hallucination-Mitigated 3D Generation from Single Image via Cyclic View Consistency Refinement

Xiwen Wang, Shichao Zhang, Hailun Zhang, Ruowei Wang, Mao Li, Chenyu Zhou, Qijun Zhao, Ji-Zhe Zhou

TL;DR

Dehallu3D achieves high-fidelity 3D generation by preserving structural details while removing hallucinated outliers. The paper also proposes the Outlier Risk Measure (ORM), a metric that quantifies geometric fidelity in 3D generation from the perspective of outliers.

Abstract

Large 3D reconstruction models have revolutionized the 3D content generation field, enabling broad applications in virtual reality and gaming. Like other large models, large 3D reconstruction models suffer from hallucinations, introducing structural outliers (e.g., odd holes or protrusions) that deviate from the input data. However, unlike in other large models, hallucinations in large 3D reconstruction models remain severely underexplored, and they lead to malformed 3D-printed objects or insufficient immersion in virtual scenes. Such hallucinations mainly arise because existing methods reconstruct 3D content from sparsely generated multi-view images, which suffer from large viewpoint gaps and discontinuities. To mitigate hallucinations by eliminating the outliers, we propose Dehallu3D for 3D mesh generation. Our key idea is to design a balanced multi-view continuity constraint that enforces smooth transitions across dense intermediate viewpoints while avoiding over-smoothing that could erase sharp geometric features. Therefore, Dehallu3D employs a plug-and-play optimization module with two key constraints: (i) adjacent consistency to ensure geometric continuity across views, and (ii) adaptive smoothness to retain fine details. We further propose the Outlier Risk Measure (ORM) metric to quantify geometric fidelity in 3D generation from the perspective of outliers. Extensive experiments show that Dehallu3D achieves high-fidelity 3D generation by effectively preserving structural details while removing hallucinated outliers.
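
The balance between the two constraints named in the abstract can be illustrated with a toy sketch. It is not the paper's formulation: the abstract does not define either term, so the function names, the 1D "depth profile" representation of each view, the guide signal, and the exponential edge weighting (a common edge-aware smoothness heuristic) are all assumptions made here for illustration.

```python
import math
from typing import Sequence


def adjacent_consistency(views: Sequence[Sequence[float]]) -> float:
    """Toy consistency term (assumed form, not the paper's loss).

    Averages the squared difference between each view and its neighbour
    on a closed ring of viewpoints, so the last view is compared with
    the first one.
    """
    n = len(views)
    total, count = 0.0, 0
    for i in range(n):
        current, neighbour = views[i], views[(i + 1) % n]
        for a, b in zip(current, neighbour):
            total += (a - b) ** 2
            count += 1
    return total / count


def adaptive_smoothness(
    profile: Sequence[float], guide: Sequence[float], k: float = 10.0
) -> float:
    """Toy edge-aware smoothness term (assumed form, not the paper's loss).

    Penalises jumps in `profile`, but down-weights the penalty wherever
    the hypothetical `guide` signal also jumps, so that sharp features
    supported by the guide are not smoothed away.
    """
    total = 0.0
    for i in range(len(profile) - 1):
        step = abs(profile[i + 1] - profile[i])
        edge = abs(guide[i + 1] - guide[i])
        total += step * math.exp(-k * edge)
    return total / (len(profile) - 1)


if __name__ == "__main__":
    # Three identical views except for one spike (an "outlier") in view 1.
    views = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    print("consistency:", adjacent_consistency(views))
    # The same step is penalised fully without guide support,
    # and almost not at all when the guide has a matching edge.
    print("unsupported step:", adaptive_smoothness([0.0, 1.0], [0.0, 0.0]))
    print("supported step:", adaptive_smoothness([0.0, 1.0], [0.0, 1.0]))
```

In this sketch, a spike that appears in only one view raises the consistency term, while a sharp step that coincides with an edge in the guide contributes almost nothing to the smoothness term. That is the intuition behind removing outliers without erasing genuine sharp features.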

Paper Structure

This paper contains 18 sections, 9 equations, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: Our Dehallu3D generates high-quality and high-fidelity 3D meshes, effectively mitigating mesh outliers while adaptively preserving sharp features. The red and blue boxes highlight noticeable outlier regions in the Baseline method, contrasted with the corresponding regions in Dehallu3D (Ours) to showcase improvements.
  • Figure 2: Overview of Dehallu3D. Dehallu3D first generates orthographic multi-view color images and corresponding normal maps, which are used to initialize a coarse mesh. Next, a globally plausible mesh is quickly constructed through the Coarse Mesh Reconstruction stage. Finally, the proposed Cyclic View Consistency Refinement (CVCR) module is employed to mitigate outliers and further refine the mesh.
  • Figure 3: Qualitative comparison of different methods in mesh reconstruction. The red and blue boxes mark specific regions in the meshes of other methods for comparison with the corresponding optimized regions in the mesh generated by Dehallu3D, while red circles highlight defects in meshes.
  • Figure 4: Qualitative comparison results for the ablation study on $\mathcal{L}_{DC}$ and $\mathcal{L}_{DS}$ in the CVCR module.
  • Figure 5: Comparison of ORM results across all methods.
  • ...and 1 more figures