An Explainable Non-local Network for COVID-19 Diagnosis

Jingfu Yang, Peng Huang, Jing Hu, Shu Hu, Siwei Lyu, Xin Wang, Jun Guo, Xi Wu

TL;DR

The paper tackles the challenge of accurate and interpretable COVID-19 diagnosis from 3D chest CT scans. It introduces NL-RAN, a 3D residual network that fuses a stacked 3D mixed attention mechanism with a non-local module to jointly capture fine lesion details and global context, while outputting attention heat maps for visualization. Empirical results on the CC-CCII dataset show NL-RAN achieving state-of-the-art performance (AUC of $0.9903$, precision of $0.9473$, F1-score of $0.9462$) and producing sharper lesion heat maps than CAM-based methods. The approach balances accuracy and interpretability with fast inference, and its heat maps provide detailed outlines of infection regions, making it practical for clinical decision support and adaptable to other 3D medical imaging tasks.

Abstract

Convolutional neural networks (CNNs) have achieved excellent results in the automatic classification of medical images. In this study, we propose a novel deep residual 3D attention non-local network (NL-RAN) that classifies CT images as COVID-19, common pneumonia, or normal, enabling rapid and explainable COVID-19 diagnosis. The network can be trained end to end. It embeds a non-local module to capture global information and a 3D attention module to focus on the details of the lesion, so that it can directly analyze the 3D lung CT and output the classification result. The output of the attention module can be used as a heat map to increase the interpretability of the model. A total of 4079 3D CT scans were included in this study. Each scan had a unique label (novel coronavirus pneumonia, common pneumonia, or normal). The CT scan cohort was randomly split into a training set of 3263 scans, a validation set of 408 scans, and a testing set of 408 scans. We compare NL-RAN with existing mainstream classification methods, such as CovNet, CBAM, and ResNet, and compare its visualization results with those of visualization methods such as CAM. Model performance was evaluated using the Area Under the ROC Curve (AUC), precision, and F1-score. NL-RAN achieved an AUC of 0.9903, a precision of 0.9473, and an F1-score of 0.9462, surpassing all of the compared classification methods. The heat map output by the attention module is also clearer than the heat map output by CAM. Our experimental results indicate that the proposed method performs significantly better than existing methods. In addition, the first attention module outputs a heat map containing detailed outline information, which increases the interpretability of the model. Our experiments also indicate that inference is fast, so the model can provide real-time assistance with diagnosis.
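The split sizes reported in the abstract (3263 + 408 + 408 = 4079) can be reproduced with a simple shuffled partition. The following is a minimal sketch, not the authors' procedure: the function name `split_scans`, the integer scan IDs, and the seed are all hypothetical, and the abstract does not say whether the split was stratified by class.

```python
import random


def split_scans(scan_ids, n_train=3263, n_val=408, n_test=408, seed=0):
    """Randomly partition scan IDs into train/validation/test subsets.

    The default sizes are the ones stated in the abstract; the seed and
    the ID scheme are illustrative assumptions.
    """
    ids = list(scan_ids)
    if n_train + n_val + n_test != len(ids):
        raise ValueError("split sizes must add up to the number of scans")
    # Shuffle a copy with a private, seeded generator so the split is repeatable.
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


# Hypothetical scan IDs 0..4078 stand in for the 4079 CT scans.
train, val, test = split_scans(range(4079))
```

The three subsets are disjoint by construction, which is what keeps the reported test metrics free of training scans.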

Paper Structure (19 sections, 12 equations, 8 figures, 4 tables)


Figures (8)

  • Figure 1: Complex appearances of pneumonia lesions in CT scans of COVID-19 patients. (a-c) are COVID-19, common pneumonia, and normal, respectively, where red arrows highlight some lesions or suspected lesions. There is a high similarity between the lesion area in (b) and the suspected lesion area in (c). (d-f) are heat maps inferred by CAM. (c), (f) and (i) are normal lungs, but some tissues have suspected lesions that are captured by the neural network. (g-i) are the heat maps inferred by our model, which are more accurate in detail and localization than those of CAM.
  • Figure 2: The architecture of our proposed NL-RAN, which consists of three mixed attention modules connected to the ResNet. After the last 3D attention module, a non-local module is embedded to enhance the receptive field, so that the model pays attention to both the details and the global information.
  • Figure 3: The structure of the non-local module. It collects global information by weighted summation of the features from all positions to the target position, where the connecting weight is calculated by the pairwise relation.
  • Figure 4: The architecture of our proposed 3D attention module. It takes the feature map that has been activated by the first attention module as the attention map and outputs a heat map.
  • Figure 5: Comparison of six methods on three-class ROC and PR curves. The larger the area enclosed by the curve, the better the performance of the model. The classification of common pneumonia (CP) is in the left column, COVID-19 (NCP) in the middle column, and normal in the right column.
  • ...and 3 more figures
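
The non-local operation described in the Figure 3 caption (a weighted sum of the features from all positions, with weights derived from a pairwise relation) can be illustrated with a small pure-Python sketch. This is an assumption-laden simplification rather than the paper's module: the pairwise relation is taken to be a dot product followed by a softmax, the learned projections of a real non-local block are omitted, and the function name `non_local` is hypothetical.

```python
import math


def non_local(features):
    """Apply a simplified non-local operation to N feature vectors.

    features: list of N positions, each a list of C floats. Flattening a
    3D feature map of size D x H x W would give N = D * H * W positions.
    """
    n = len(features)
    out = []
    for i in range(n):
        # Pairwise relation between target position i and every position j.
        # A plain dot product is assumed here for illustration.
        scores = [
            sum(a * b for a, b in zip(features[i], features[j]))
            for j in range(n)
        ]
        # Softmax (shifted by the max for numerical stability) turns the
        # scores into connection weights that sum to 1.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Global context: weighted sum of the features from all positions.
        context = [
            sum(weights[j] * features[j][c] for j in range(n))
            for c in range(len(features[i]))
        ]
        # Residual connection: add the context back to the input feature.
        out.append([x + y for x, y in zip(features[i], context)])
    return out


# Toy input: two positions with two channels each.
toy = [[1.0, 0.0], [0.0, 1.0]]
result = non_local(toy)
```

Because every output position mixes in features from every other position, the receptive field covers the whole volume in a single step, which is the property the Figure 2 caption attributes to the non-local module.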