FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors

Shuai Liu, Boyang Li, Zhiyu Fang, Mingyue Cui, Kai Huang

TL;DR

This work tackles the interpretability gap in LiDAR-based 3D detectors by introducing FFAM, a visualization framework that builds a global concept activation map from 3D backbone features using non-negative matrix factorization and refines it with object-specific gradients to produce per-detection saliency maps. A voxel upsampling strategy aligns the activation map with the sparse input point cloud, enabling accurate, object-focused explanations. Experiments on KITTI and Waymo with detectors such as SECOND and CenterPoint show FFAM outperforming prior methods (including OccAM and image-based saliency approaches) in qualitative visualizations and on quantitative metrics such as Deletion, Insertion, VEA, and PG. The approach contributes a practical tool for interpreting 3D detectors, aiding debugging, trust, and model improvement, with code to be released at https://github.com/Say2L/FFAM.git.

Abstract

LiDAR-based 3D object detection has made impressive progress recently, yet most existing models are black boxes that lack interpretability. Previous explanation approaches primarily focus on analyzing image-based models and are not readily applicable to LiDAR-based 3D detectors. In this paper, we propose a feature factorization activation map (FFAM) to generate high-quality visual explanations for 3D detectors. FFAM employs non-negative matrix factorization to generate concept activation maps and subsequently aggregates these maps to obtain a global visual explanation. To achieve object-specific visual explanations, we refine the global visual explanation using the feature gradient of a target object. Additionally, we introduce a voxel upsampling strategy to align the scale between the activation map and the input point cloud. We qualitatively and quantitatively analyze FFAM with multiple detectors on several datasets. Experimental results validate the high-quality visual explanations produced by FFAM. The code will be available at https://github.com/Say2L/FFAM.git.
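
The pipeline outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made for illustration only: backbone features are taken to be non-negative (for example, post-ReLU), NMF is solved with standard multiplicative updates, concept maps are aggregated by a plain sum, the gradient refinement uses the per-voxel gradient norm as a weight, and voxel upsampling is approximated by nearest-voxel lookup. All function names are hypothetical.

```python
import numpy as np

EPS = 1e-9  # small constant to avoid division by zero


def nmf(V, rank, n_iter=200, seed=0):
    """Factorize non-negative V (n x c) as W (n x rank) @ H (rank x c)
    with multiplicative updates for the Frobenius objective."""
    rng = np.random.default_rng(seed)
    n, c = V.shape
    W = rng.random((n, rank)) + EPS
    H = rng.random((rank, c)) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ (H @ H.T) + EPS)
    return W, H


def global_activation_map(features, rank=4):
    """Each column of W is one concept activation map over voxels.
    Summing the columns is an assumed aggregation rule."""
    W, _ = nmf(features, rank)
    return W.sum(axis=1)


def object_specific_map(global_map, grads):
    """Reweight the global map by a per-voxel gradient magnitude
    (assumed refinement rule), then scale to [0, 1]."""
    weight = np.linalg.norm(grads, axis=1)
    saliency = global_map * weight
    return saliency / (saliency.max() + EPS)


def upsample_to_points(voxel_centers, voxel_saliency, points):
    """Give every input point the saliency of its nearest voxel
    (a stand-in for the paper's voxel upsampling strategy)."""
    dists = np.linalg.norm(points[:, None, :] - voxel_centers[None, :, :], axis=2)
    return voxel_saliency[dists.argmin(axis=1)]


# Toy data standing in for backbone features and gradients of one detection.
rng = np.random.default_rng(1)
features = rng.random((50, 8))        # 50 voxels, 8 channels, non-negative
grads = rng.normal(size=(50, 8))      # gradient of a detection w.r.t. features
voxel_centers = rng.random((50, 3))
points = rng.random((200, 3))

global_map = global_activation_map(features)
voxel_saliency = object_specific_map(global_map, grads)
point_saliency = upsample_to_points(voxel_centers, voxel_saliency, points)
```

In a real detector the features and gradients would come from the sparse 3D backbone for a chosen detection; the random arrays here only exercise the shapes.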

Paper Structure (16 sections, 10 equations, 8 figures, 9 tables)

Figures (8)

  • Figure 1: Visualization of FFAM outputs. (a) global concept activation map and (b) object-specific activation map.
  • Figure 2: Overall framework of FFAM, which generates an object-specific saliency map for a detection $d_i$.
  • Figure 3: Saliency maps for SECOND and CenterPoint. The green bounding boxes indicate the detected objects, while warmer colors (using the turbo colormap) represent higher point contributions to these detections. The crops are provided for visualization purposes only.
  • Figure 4: Average saliency maps for different object attributes. $(x,y,z)$ denotes the center of the predicted object. $l$, $w$, $h$, $r$ and $s$ represent the length, width, height, rotation angle and classification score of the predicted object, respectively. $d$ indicates the combination of all attributes.
  • Figure 5: AUC diagrams for Deletion and Insertion. Average IoU vs. (a) Deletion steps and (b) Insertion steps.
  • ...and 3 more figures
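
The Deletion curve referenced in Figure 5 can be sketched as follows. This is an illustrative outline under stated assumptions, not the paper's evaluation code: `score_fn` is a hypothetical callable standing in for "run the detector on the remaining points and measure IoU with the original detection", points are removed in equal-sized batches from most to least salient, and the area under the curve is computed with the trapezoidal rule over a normalized step axis.

```python
import numpy as np


def deletion_curve(points, saliency, score_fn, n_steps=10):
    """Remove points in order of decreasing saliency and record the score
    after each step. A good saliency map makes the score drop quickly."""
    order = np.argsort(-saliency)          # most salient points first
    n = len(points)
    keep = np.ones(n, dtype=bool)
    scores = [score_fn(points[keep])]      # score on the full point cloud
    for step in range(1, n_steps + 1):
        cut = round(step * n / n_steps)
        keep[order[:cut]] = False
        scores.append(score_fn(points[keep]))
    return np.array(scores)


def auc(scores):
    """Trapezoidal area under the curve, with steps normalized to [0, 1]."""
    x = np.linspace(0.0, 1.0, len(scores))
    return float(np.sum((scores[1:] + scores[:-1]) / 2.0 * np.diff(x)))


# Toy setup: 20 "object" points near the origin, 80 background points far away.
rng = np.random.default_rng(0)
obj = rng.normal(scale=0.1, size=(20, 3))
bg = rng.normal(scale=0.1, size=(80, 3)) + np.array([10.0, 0.0, 0.0])
points = np.vstack([obj, bg])


def toy_score(pts):
    # Hypothetical stand-in for IoU: fraction of object points still present.
    return float((np.linalg.norm(pts, axis=1) < 1.0).sum()) / 20.0


good_saliency = np.concatenate([np.ones(20), np.zeros(80)])
bad_saliency = 1.0 - good_saliency
auc_good = auc(deletion_curve(points, good_saliency, toy_score))
auc_bad = auc(deletion_curve(points, bad_saliency, toy_score))
```

For Deletion, a lower AUC indicates a better explanation. An Insertion curve follows the same pattern in reverse: start from an empty cloud, add points in order of decreasing saliency, and prefer a higher AUC.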