
XSPLAIN: XAI-enabling Splat-based Prototype Learning for Attribute-aware INterpretability

Dominik Galus, Julia Farganus, Tymoteusz Zapala, Mikołaj Czachorowski, Piotr Borycki, Przemysław Spurek, Piotr Syga

TL;DR

XSPLAIN addresses the interpretability gap in 3D Gaussian Splat-based classification by introducing an ante-hoc, prototype-based explanation framework. It uses a voxel-aggregated PointNet backbone and a trainable orthogonal transform to disentangle latent channels while strictly preserving the original decision boundary through classifier compensation. Explanations are grounded in representative training prototypes and localized to active voxel regions, enabling intuitive 'this looks like that' reasoning without sacrificing accuracy. Empirical results across multiple 3DGS benchmarks show competitive performance and enhanced interpretability, corroborated by a user study indicating higher perceived transparency compared with post-hoc baselines.

Abstract

3D Gaussian Splatting (3DGS) has rapidly become a standard for high-fidelity 3D reconstruction, yet its adoption in many critical domains is hindered by the limited interpretability of both the generative models and the classifiers that operate on splats. Explainability methods exist for other 3D representations, such as point clouds, but they typically rely on ambiguous saliency maps that fail to capture the volumetric coherence of Gaussian primitives. We introduce XSPLAIN, the first ante-hoc, prototype-based interpretability framework designed specifically for 3DGS classification. Our approach combines a voxel-aggregated PointNet backbone with a novel, invertible orthogonal transformation that disentangles feature channels for interpretability while strictly preserving the original decision boundaries. Explanations are grounded in representative training examples, enabling intuitive "this looks like that" reasoning without any degradation in classification performance. A user study (N=51) shows a clear preference for our approach: participants selected XSPLAIN explanations as the best 48.4% of the time, significantly outperforming the baselines ($p<0.001$), which indicates that XSPLAIN improves transparency and user trust. The source code for this work is available at: https://github.com/Solvro/ml-splat-xai
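
The claim that an orthogonal transformation can change the feature channels while leaving the decision boundary untouched can be illustrated with a small numerical sketch. It assumes a linear classifier acting directly on the transformed feature vector, which is a simplification and not necessarily where the transform sits in the XSPLAIN network. The names `random_orthogonal` and `compensate_classifier` are hypothetical, and the random matrix stands in for the trained transform. If features are rotated as z' = Qz and the classifier weights are replaced by W' = WQ^T, then W'z' = WQ^TQz = Wz, so the logits are identical.

```python
import numpy as np

def random_orthogonal(dim, seed=0):
    """Draw an orthogonal matrix via QR; a stand-in for a learned transform."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def compensate_classifier(weight, q):
    """Return W' = W Q^T so that W' (Q z) = W z for every feature vector z."""
    return weight @ q.T

rng = np.random.default_rng(1)
dim, n_classes = 8, 3
features = rng.standard_normal((5, dim))        # 5 hypothetical feature vectors
weight = rng.standard_normal((n_classes, dim))  # hypothetical linear classifier
bias = rng.standard_normal(n_classes)

q = random_orthogonal(dim)
rotated = features @ q.T                        # z' = Q z, applied row-wise
weight_comp = compensate_classifier(weight, q)

logits_before = features @ weight.T + bias
logits_after = rotated @ weight_comp.T + bias

# The channels changed, but the predictions did not.
print(np.allclose(logits_before, logits_after))
```

Because Q is orthogonal, its inverse is simply its transpose, so the compensation requires no matrix inversion and the bias term is left unchanged.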

Paper Structure (26 sections, 29 equations, 19 figures, 4 tables)

This paper contains 26 sections, 29 equations, 19 figures, 4 tables.

Figures (19)

  • Figure 1: XSPLAIN provides ante-hoc, prototype-based explanations for 3D Gaussian Splat classification. A PointNet-based classifier predicts the object category from Gaussian Splat representations, while identifying the most influential voxel regions that drive the decision. Explanations are generated by retrieving representative training examples that activate similar latent responses in the same regions, enabling intuitive "looks like that" reasoning grounded in both geometry and semantic attributes.
  • Figure 2: Overview of the XSPLAIN architecture. A) The classification backbone is a modified PointNet architecture extended by a voxel aggregation layer, producing structured latent representations at the voxel level from Gaussian Splat inputs. B) An attachable disentangling module learns an invertible linear transformation that separates latent channels for interpretability while preserving the original global representation and classification output. C) Explanations are generated by identifying the most active disentangled channels, visualizing the corresponding influential voxels, and retrieving representative training examples that exhibit similar channel activations.
  • Figure 3: Left: Selected prototypes for the most active channels with the disentangling module applied. Right: Selected prototypes for the same object without applying the disentangling module.
  • Figure 4: Comparison between PointSHAP, LIME, and XSPLAIN explanations.
  • Figure 5: XSPLAIN results on samples from the 3D car dataset.
  • ...and 14 more figures
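
The explanation step described in part C of the Figure 2 caption (pick the most active channels, locate the influential voxels, retrieve similar training examples) can be sketched as below. This is a toy illustration under stated assumptions, not the authors' implementation: it assumes per-voxel activations are max-pooled into a global vector and that a prototype is the training sample with the highest activation on a channel. The function name `explain` and all array values are hypothetical.

```python
import numpy as np

def explain(query_voxels, train_global, top_channels=2):
    """Sketch of prototype-based explanation.

    query_voxels: (V, C) per-voxel channel activations of the query object.
    train_global: (N, C) pooled channel activations of N training objects.
    """
    # Assumption: the global representation is a max over voxels.
    query_global = query_voxels.max(axis=0)
    # Channels sorted from most to least active for this query.
    channels = np.argsort(query_global)[::-1][:top_channels]
    explanation = []
    for c in channels:
        explanation.append({
            "channel": int(c),
            # Voxel that produces the channel's peak response.
            "voxel": int(query_voxels[:, c].argmax()),
            # Training sample that activates the same channel most strongly.
            "prototype": int(train_global[:, c].argmax()),
        })
    return explanation

# Toy data: 3 voxels x 3 channels for the query, 4 training objects.
query_voxels = np.array([[0.1, 0.9, 0.0],
                         [0.7, 0.2, 0.1],
                         [0.0, 0.3, 0.05]])
train_global = np.array([[0.2, 0.1, 0.9],
                         [0.8, 0.3, 0.0],
                         [0.1, 0.95, 0.2],
                         [0.5, 0.5, 0.5]])

result = explain(query_voxels, train_global)
for item in result:
    print(item)
```

In this toy data, channel 1 is the most active, driven by voxel 0 and matched to training object 2; channel 0 follows, driven by voxel 1 and matched to training object 1.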