QAL: A Loss for Recall Precision Balance in 3D Reconstruction

Pranay Meshram, Yash Turkar, Kartikeya Singh, Praveen Raj Masilamani, Charuvahan Adhivarahan, Karthik Dantu

TL;DR

QAL addresses a critical gap in 3D vision losses by explicitly balancing recall and precision through a coverage-weighted matching term and an uncovered-GT attraction term, formulated as $L_{\mathrm{QAL}} = L_{\mathrm{cov}} + \lambda_{\mathrm{attr}} L_{\mathrm{attr}}$. It acts as a drop-in replacement for Chamfer Distance and Earth Mover’s Distance, improving surface coverage and reducing holes while managing spurious predictions. Across point-cloud completion, single-view reconstruction, and image-to-mesh pipelines, QAL yields consistent coverage gains (averaging +4.3 points) and enhances downstream grasping performance, with robustness to resolution and architecture. The method aligns training with thresholded metrics like Cov@$\tau$ and SP@$\tau$, offering a practical, interpretable objective for robust 3D vision and safety-critical robotics, and is accompanied by efficient GPU implementations for wide adoption.

Abstract

Volumetric learning underpins many 3D vision tasks such as completion, reconstruction, and mesh generation, yet training objectives still rely on Chamfer Distance (CD) or Earth Mover's Distance (EMD), which fail to balance recall and precision. We propose Quality-Aware Loss (QAL), a drop-in replacement for CD/EMD that combines a coverage-weighted nearest-neighbor term with an uncovered-ground-truth attraction term, explicitly decoupling recall and precision into tunable components. Across diverse pipelines, QAL achieves consistent coverage gains, improving by an average of +4.3 pts over CD and +2.8 pts over the best alternatives. Though modest in percentage, these improvements reliably recover thin structures and under-represented regions that CD/EMD overlook. Extensive ablations confirm stable performance across hyperparameters and across output resolutions, while full retraining on PCN and ShapeNet demonstrates generalization across datasets and backbones. Moreover, QAL-trained completions yield higher grasp scores under GraspNet evaluation, showing that improved coverage translates directly into more reliable robotic manipulation. QAL thus offers a principled, interpretable, and practical objective for robust 3D vision and safety-critical robotics pipelines.
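
To make the two-term structure $L_{\mathrm{QAL}} = L_{\mathrm{cov}} + \lambda_{\mathrm{attr}} L_{\mathrm{attr}}$ concrete, the following is a minimal, non-differentiable sketch in plain Python. Only the sum and the idea of an $\epsilon$-ball around each ground-truth (GT) point come from the text above. The specific forms of the two terms are assumptions made for illustration: here the coverage term averages squared GT-to-prediction nearest-neighbor distances over covered GT points, and the attraction term does the same over uncovered GT points. The function name `qal_sketch`, its helper, and the default values are hypothetical, and the $\omega$ hyperparameter and any precision-side weighting are not modeled.

```python
import math


def nearest_dist(point, cloud):
    """Euclidean distance from `point` to its nearest neighbor in `cloud`."""
    return min(math.dist(point, q) for q in cloud)


def qal_sketch(pred, gt, eps=0.05, lambda_attr=1.0):
    """Illustrative two-term loss; the term definitions are ASSUMED, not the paper's.

    pred, gt: non-empty lists of equal-dimension coordinate tuples.
    Returns (l_cov, l_attr, total).
    """
    # Nearest-prediction distance for every GT point.
    d_gt = [nearest_dist(g, pred) for g in gt]

    # ASSUMPTION: a GT point is "covered" if some prediction lies in its eps-ball.
    covered = [d for d in d_gt if d <= eps]
    uncovered = [d for d in d_gt if d > eps]

    # ASSUMPTION: both terms are squared distances normalized by the GT count.
    l_cov = sum(d * d for d in covered) / len(gt)
    l_attr = sum(d * d for d in uncovered) / len(gt)

    # The combination below is the formula stated in the TL;DR.
    return l_cov, l_attr, l_cov + lambda_attr * l_attr


if __name__ == "__main__":
    gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    pred = [(0.0, 0.0, 0.0)]
    print(qal_sketch(pred, gt, eps=0.1, lambda_attr=2.0))
```

Under these assumptions, raising `lambda_attr` increases the penalty on GT regions that no prediction reaches, which is one way to read the claim that recall is a separately tunable component. A training implementation would need a differentiable, batched version on the GPU.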

Paper Structure

This paper contains 18 sections, 6 equations, 12 figures, 8 tables, 1 algorithm.

Figures (12)

  • Figure 1: Point cloud completion with ECG [pan_ecg_2020] on MVP [pan2021variational]. Input partial clouds are omitted for space; all methods use identical inputs.
  • Figure 2: Illustration of QAL and evaluation metrics. (a) QAL components: GT points (blue squares) with $\epsilon$-balls; coverage links (orange) connect GT$\rightarrow$prediction matches, while attraction links (purple) pull predictions toward uncovered GT regions. (b) Evaluation metrics: predictions inside $\epsilon$-balls are counted toward thresholded coverage, while those outside (gray) are spurious points.
  • Figure 3: QAL hyperparameter ablation. Each panel shows grouped bars for Coverage (Cov; $\uparrow$), Spurious Points ($\overline{\mathrm{SP}}$; $\uparrow$), and Chamfer Distance (CD$\times10^{3}$; $\downarrow$). Left: $\epsilon$ sweep with $\omega\!\approx\!10$, $\lambda_{\mathrm{attr}}\!\approx\!1.0$. Middle: $\omega$ sweep at the selected $\epsilon^{\star}$. Right: $\lambda_{\mathrm{attr}}$ sweep at $(\epsilon^{\star},\omega^{\star})$.
  • Figure 4: Qualitative point cloud completion results for models trained on TopNet, PCN, and ECG (top to bottom) with four different loss functions: CD, HCD, InfoCD, and QAL. The highlighted QAL column shows that the model learned intricate details that models trained with the other losses miss. These results are on the validation set after 100 epochs of training with $\epsilon=0.001$, $\omega=10.0$, and $\lambda_{\mathrm{attr}}=1.0$.
  • Figure 5: Per-category metric comparison for the ECG model trained with three loss functions: CD, EMD, and QAL. Panels show (a) $CD_t$ and (b) Quality, the equally weighted average of Cov and $\overline{\mathrm{SP}}$. The categories are grouped by the complexity of the object structure.
  • ...and 7 more figures
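
The thresholded metrics described in the Figure 2(b) caption can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's evaluation code: coverage is taken to be the fraction of GT points with at least one prediction inside their ball of radius `tau`, and the spurious fraction is taken to be the share of predictions that lie outside every GT ball. The function name `coverage_and_spurious` is hypothetical.

```python
import math


def coverage_and_spurious(pred, gt, tau):
    """Return (coverage, spurious) fractions in [0, 1] at threshold `tau`.

    ASSUMED definitions (one reading of the Figure 2(b) caption):
      coverage = share of GT points with a prediction within distance tau
      spurious = share of predictions farther than tau from every GT point
    pred, gt: non-empty lists of equal-dimension coordinate tuples.
    """
    covered = sum(
        1 for g in gt if min(math.dist(g, p) for p in pred) <= tau
    )
    spurious = sum(
        1 for p in pred if min(math.dist(p, g) for g in gt) > tau
    )
    return covered / len(gt), spurious / len(pred)


if __name__ == "__main__":
    gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    pred = [(0.0, 0.0, 0.01), (5.0, 5.0, 5.0)]
    cov, sp = coverage_and_spurious(pred, gt, tau=0.05)
    print(cov, sp)  # 0.5 0.5
```

In the usage example, one of the two GT points has a prediction inside its ball and one of the two predictions lies far from all GT points, so both fractions are 0.5. The brute-force search is quadratic in the number of points; a spatial index or a batched GPU distance computation would be the usual choice at realistic cloud sizes.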