Explaining 3D Computed Tomography Classifiers with Counterfactuals
Joseph Paul Cohen, Louis Blankemeier, Akshay Chaudhari
TL;DR
This work addresses the difficulty of explaining 3D CT classifiers by extending the Latent Shift counterfactual method to volumetric data. It introduces a slice-based autoencoder (Slice AE) that keeps memory use practical when generating gradient-based counterfactuals (CFs) on high-resolution CT volumes: a 2D encoder processes individual slices, and the slice encodings are concatenated to preserve 3D context. A VQ-GAN trained on over 1.4 million CT slices enables plausible latent-space manipulations, and the method is demonstrated on clinical phenotype prediction and lung segmentation, with qualitative and quantitative validation on multiple public datasets. The approach yields localized, clinically meaningful counterfactuals, improves the interpretability of high-stakes medical AI, and is publicly released to support future research and auditing.
Abstract
Counterfactual explanations enhance the interpretability of deep learning models in medical imaging, yet adapting them to 3D CT scans is challenging because of volumetric complexity and resource demands. We extend the Latent Shift counterfactual generation method from 2D applications to explain classifiers of 3D computed tomography (CT) scans. We address the challenges associated with 3D classifiers, such as limited training samples and high memory demands, by implementing a slice-based autoencoder and by blocking gradients everywhere except in specific chunks of slices. This method uses a 2D encoder trained on CT slices, whose per-slice outputs are subsequently combined to maintain 3D context. We demonstrate this technique on two models for clinical phenotype prediction and lung segmentation. Our approach is both memory-efficient and effective for generating interpretable counterfactuals in high-resolution 3D medical imaging.
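The mechanism described in the abstract can be illustrated with a toy sketch. It assumes the usual Latent Shift formulation, in which an input is encoded to a latent z, the latent is moved along the gradient of the classifier output f(D(z)) with respect to z, and the shifted latent is decoded back into a counterfactual. Everything concrete below is a hypothetical stand-in chosen so the example runs with only the Python standard library: the linear `encode_slice` and `decode_slice` replace the VQ-GAN, `classify` is a toy sigmoid classifier, and the loop that assembles the gradient chunk by chunk is one plausible reading of "gradient blocking", not a description of the authors' implementation.

```python
import math

SCALE = 0.5  # hypothetical encoder gain; the toy decoder inverts it exactly


def encode_slice(pixels):
    """Toy 2D encoder applied to one slice: z = SCALE * x."""
    return [SCALE * p for p in pixels]


def decode_slice(latent):
    """Toy 2D decoder for one slice: x = z / SCALE."""
    return [v / SCALE for v in latent]


def classify(volume, weights):
    """Toy 3D classifier: sigmoid of a weighted sum over every voxel."""
    logit = sum(w * v for ws, sl in zip(weights, volume) for w, v in zip(ws, sl))
    return 1.0 / (1.0 + math.exp(-logit))


def chunk_gradient(latents, weights, chunk):
    """Gradient of f(D(z)) w.r.t. the latents of the slices in `chunk` only.

    The forward pass still sees the whole volume; slices outside the chunk
    are treated as constants, so no gradient is kept for them.
    """
    volume = [decode_slice(z) for z in latents]
    f = classify(volume, weights)
    # Chain rule for this toy model: df/dz = f * (1 - f) * w / SCALE
    return {i: [f * (1.0 - f) * w / SCALE for w in weights[i]] for i in chunk}


def latent_shift(volume, weights, lam, chunk_size=2):
    """Return the counterfactual volume D(z + lam * df/dz)."""
    latents = [encode_slice(s) for s in volume]  # slice-wise encoding
    grads = {}
    # Assumption: visit every chunk in turn and collect its gradient.
    for start in range(0, len(latents), chunk_size):
        chunk = range(start, min(start + chunk_size, len(latents)))
        grads.update(chunk_gradient(latents, weights, chunk))
    shifted = [[z + lam * g for z, g in zip(latents[i], grads[i])]
               for i in range(len(latents))]
    return [decode_slice(z) for z in shifted]


# Tiny made-up "volume": 4 slices of 2 voxels each, with made-up weights.
volume = [[0.2, 0.4], [0.1, 0.3], [0.5, 0.5], [0.0, 0.2]]
weights = [[1.0, -1.0], [0.5, 0.0], [0.0, 2.0], [-1.0, 1.0]]
for lam in (-1.0, 0.0, 1.0):
    cf = latent_shift(volume, weights, lam)
    print(lam, round(classify(cf, weights), 4))
```

In this sketch a negative `lam` lowers the toy classifier's output and a positive `lam` raises it, while `lam = 0` reproduces the input. In a real setting the memory saving would come from keeping activations for backpropagation only within the active chunk; the toy model uses a closed-form gradient, so it shows the control flow rather than the saving itself.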
