Interpretable Affordance Detection on 3D Point Clouds with Probabilistic Prototypes

Maximilian Xiling Li, Korbinian Rudolf, Nils Blank, Rudolf Lioutikov

TL;DR

This work tackles the need for trustworthy, interpretable affordance detection on 3D point clouds. It introduces probabilistic prototypes integrated into point-cloud segmentation backbones to provide inherent, case-based explanations while maintaining competitive accuracy on the 3D-AffordanceNet benchmark. Key contributions include the first application of probabilistic prototypes to 3D affordance detection, demonstrated performance improvements and rich interpretability visualizations, and extensive ablations on prototype count and backbone choices. The approach holds promise for safer human–robot interaction by making model decisions more transparent and easier to validate in real-world robotic scenarios.

Abstract

Robotic agents need to understand how to interact with objects in their environment, both autonomously and during human-robot interactions. Affordance detection on 3D point clouds, which identifies object regions that allow specific interactions, has traditionally relied on deep learning models like PointNet++, DGCNN, or PointTransformerV3. However, these models operate as black boxes, offering no insight into their decision-making processes. Prototypical Learning methods, such as ProtoPNet, provide an interpretable alternative to black-box models by employing a "this looks like that" case-based reasoning approach. However, they have been primarily applied to image-based tasks. In this work, we apply prototypical learning to models for affordance detection on 3D point clouds. Experiments on the 3D-AffordanceNet benchmark dataset show that prototypical models achieve competitive performance with state-of-the-art black-box models and offer inherent interpretability. This makes prototypical models a promising candidate for human-robot interaction scenarios that require increased trust and safety.

Paper Structure

This paper contains 18 sections, 3 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Prototypes provide explanations by displaying the similarities of learned representations to image features.
  • Figure 2: Architecture overview of the extended point cloud processing model. A point cloud segmentation backbone, e.g., PointNet++ or DGCNN, produces a feature map. A prototype layer computes prototype similarity scores, from which the classification head generates the segmentation maps.
  • Figure 3: Illustration of the probabilistic prototypes on the hypersphere [Li2024HyperpgPrototypicalGaussians].
  • Figure 4: Examples for Affordance Prediction using the prototypical model with PointNet++ backbone. The point color indicates the affordance label and the predicted probability.
  • Figure 5: Prototypes provide insights into a model's reasoning by highlighting point cloud segments with high activations for new inputs and providing similar activation regions for known samples from the training data.
  • ...and 1 more figure
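
The pipeline described in the Figure 2 caption (backbone features, then prototype similarity scores, then a classification head) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it substitutes plain cosine similarity for the paper's probabilistic prototypes, and the function names, the toy feature values, and the per-affordance sigmoid head are all assumptions made for this example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length, i.e. place it on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def prototype_similarities(features, prototypes):
    """Cosine similarity of every point feature to every prototype.

    features:   N x D nested list, a stand-in for the backbone's feature map
    prototypes: P x D nested list of prototype vectors (hypothetical values)
    returns:    N x P nested list of similarity scores in [-1, 1]
    """
    unit_protos = [l2_normalize(p) for p in prototypes]
    scores = []
    for feat in features:
        unit_feat = l2_normalize(feat)
        scores.append(
            [sum(a * b for a, b in zip(unit_feat, p)) for p in unit_protos]
        )
    return scores


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def classification_head(scores, weights, bias):
    """Linear layer over similarity scores, followed by a sigmoid.

    Assumption: one independent probability per affordance and point.
    weights: A x P nested list, bias: list of length A
    returns: N x A nested list of per-point affordance probabilities
    """
    return [
        [
            sigmoid(sum(w * s for w, s in zip(row, point_scores)) + b)
            for row, b in zip(weights, bias)
        ]
        for point_scores in scores
    ]


if __name__ == "__main__":
    # Toy stand-ins: 3 points with 2-D features, 2 prototypes, 1 affordance.
    features = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
    prototypes = [[2.0, 0.0], [0.0, 1.0]]
    scores = prototype_similarities(features, prototypes)
    probs = classification_head(scores, weights=[[1.0, -1.0]], bias=[0.0])
    for point, s, p in zip(features, scores, probs):
        print(point, [round(v, 3) for v in s], [round(v, 3) for v in p])
```

Because the head only sees similarity scores, each prediction can be traced back to the prototypes that contributed most to it, which is the property the case-based explanations rely on.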