Exploration of Interpretability Techniques for Deep COVID-19 Classification using Chest X-ray Images
Soumick Chatterjee, Fatima Saad, Chompunuch Sarasaen, Suhita Ghosh, Valerie Krug, Rupali Khatun, Rahul Mishra, Nirja Desai, Petia Radeva, Georg Rose, Sebastian Stober, Oliver Speck, Andreas Nürnberger
TL;DR
The paper evaluates the interpretability of five pretrained CNNs (ResNet18, ResNet34, InceptionV3, InceptionResNetV2, DenseNet161) and a majority-voting ensemble for multiclass and multilabel COVID-19-related chest X-ray classification. It applies local interpretability methods (occlusion, saliency, input X gradient, guided backpropagation, integrated gradients, and DeepLIFT) and a global technique, Neuron Activation Profiles (NAPs), to compare model reasoning. Mean Micro-F1 scores for COVID-19 classification range from 0.66 to 0.875 across the individual models, and the ensemble achieves 0.89. DenseNet161 attains the top accuracy but exhibits many dead neurons and attention to non-biological regions, whereas ResNet18 offers strong performance with clearer lesion-focused interpretability. The study shows that interpretability analyses can guide clinical deployment by revealing when higher accuracy comes at the cost of relying on biologically irrelevant features, highlighting the trade-off between explainability and performance in chest X-ray COVID-19 screening.
Abstract
The outbreak of COVID-19 has shocked the entire world with its rapid spread and has challenged many sectors. One of the most effective ways to limit its spread is the early and accurate diagnosis of infected patients. Medical imaging, such as X-ray and Computed Tomography (CT), combined with the potential of Artificial Intelligence (AI), plays an essential role in supporting medical personnel in the diagnosis process. Thus, in this article, five different deep learning models (ResNet18, ResNet34, InceptionV3, InceptionResNetV2 and DenseNet161) and their ensemble, formed by majority voting, have been used to classify COVID-19, pneumoniæ and healthy subjects using chest X-ray images. Multilabel classification was performed to predict multiple pathologies for each patient, if present. The interpretability of each network was thoroughly studied using local interpretability methods (occlusion, saliency, input X gradient, guided backpropagation, integrated gradients, and DeepLIFT) and a global technique, neuron activation profiles. The mean Micro-F1 score of the individual models for COVID-19 classification ranges from 0.66 to 0.875, and is 0.89 for the ensemble of the models. The qualitative results showed that the ResNets were the most interpretable models. This research demonstrates the importance of using interpretability methods to compare different models before deciding which model performs best.
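To make the ensemble and the reported metric concrete, the following is a minimal sketch of per-label majority voting over binary multilabel predictions, followed by a Micro-F1 computation. It is an illustration under stated assumptions, not the authors' implementation: the function names `majority_vote` and `micro_f1`, the strict-majority rule applied independently to each label, and the toy predictions are all hypothetical, and three models are used instead of five to keep the example short.

```python
def majority_vote(model_preds):
    """Combine binary multilabel predictions by per-label strict majority.

    model_preds: nested list indexed as [model][sample][label], values 0 or 1.
    Assumption: each label is voted on independently; a label is set only
    if more than half of the models predict it.
    """
    n_models = len(model_preds)
    n_samples = len(model_preds[0])
    n_labels = len(model_preds[0][0])
    ensemble = []
    for s in range(n_samples):
        row = []
        for l in range(n_labels):
            votes = sum(model_preds[m][s][l] for m in range(n_models))
            row.append(1 if 2 * votes > n_models else 0)
        ensemble.append(row)
    return ensemble


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, FN over all samples and labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): 3 models, 2 samples, 3 labels.
model_preds = [
    [[1, 0, 0], [0, 1, 1]],
    [[1, 0, 0], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 1]],
]
y_true = [[1, 0, 0], [0, 1, 0]]

ensemble = majority_vote(model_preds)
score = micro_f1(y_true, ensemble)
print(ensemble, score)
```

With an odd number of models, as with the five networks in the paper, a strict per-label majority can never tie. Micro-averaging pools the counts over all labels before computing F1, so frequent labels contribute more to the score than rare ones.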
