Robust COVID-19 Detection in CT Images with CLIP

Li Lin; Yamini Sri Krubha; Zhenhuan Yang; Cheng Ren; Thuc Duy Le; Irene Amerini; Xin Wang; Shu Hu

TL;DR

This work introduces the first lightweight detector designed to overcome key obstacles in medical imaging. It pairs a frozen CLIP image encoder with a trainable multilayer perceptron (MLP) and achieves superior performance despite the inherent data limitations.

Abstract

In medical imaging, and particularly in COVID-19 detection, deep learning models face substantial challenges: they require extensive computational resources, well-annotated datasets are scarce, and a significant amount of the available data is unlabeled. In this work, we introduce the first lightweight detector designed to overcome these obstacles, leveraging a frozen CLIP image encoder and a trainable multilayer perceptron (MLP). Enhanced with Conditional Value at Risk (CVaR) for robustness and a loss landscape flattening strategy for improved generalization, our model is tailored for high efficacy in COVID-19 detection. Furthermore, we integrate a teacher-student framework to capitalize on the vast amounts of unlabeled data, enabling our model to achieve superior performance despite the inherent data limitations. Experimental results on the COV19-CT-DB dataset demonstrate the effectiveness of our approach, which surpasses the baseline by up to 10.6% in 'macro' F1 score in supervised learning. The code is available at https://github.com/Purdue-M2/COVID-19_Detection_M2_PURDUE.
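To make the CVaR idea concrete, the sketch below computes an empirical CVaR over a batch of per-sample losses using the standard variational form, $\min_{\lambda} \lambda + \frac{1}{\alpha n}\sum_i \max(\ell_i - \lambda, 0)$. This is an illustrative assumption rather than the paper's exact formulation, which is not given in this section; the names `bce` and `cvar_loss` are hypothetical, and the per-sample losses would in practice come from the MLP head on top of frozen CLIP features.

```python
import math


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one predicted probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def cvar_loss(losses, alpha):
    """Empirical CVaR of per-sample losses at level alpha in (0, 1].

    Assumed form: min over lam of lam + sum(max(l - lam, 0)) / (alpha * n).
    The objective is convex and piecewise linear in lam, so its minimum
    is attained at one of the observed loss values.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    n = len(losses)

    def objective(lam):
        return lam + sum(max(l - lam, 0.0) for l in losses) / (alpha * n)

    return min(objective(lam) for lam in losses)


# Hypothetical predictions and labels, for illustration only.
probs = [0.9, 0.8, 0.3, 0.6]
labels = [1, 1, 1, 0]
per_sample = [bce(p, y) for p, y in zip(probs, labels)]
robust = cvar_loss(per_sample, alpha=0.5)  # emphasizes the hardest half of the batch
```

With `alpha = 1` this reduces to the ordinary mean loss, and as `alpha` shrinks it concentrates on the hardest samples, which is the sense in which CVaR targets robustness.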

Paper Structure (18 sections, 3 equations, 5 figures, 2 tables, 1 algorithm)

Introduction
Related Work
COVID-19 Detection
CLIP
Method
Supervised Learning
Semi-Supervised Learning
Optimization
Experiments
Experimental Settings
Datasets
Evaluation Metrics
Baseline Methods
Implementation Details
Results
...and 3 more sections

Figures (5)

Figure 1: Comparison between our method and the traditional method. First row: The traditional method trains a whole deep learning model (e.g., a CNN) with a binary cross-entropy loss $\mathcal{L}_{BCE}$. Second row: Our method enhances COVID-19 detection by utilizing a frozen CLIP and a lightweight MLP classifier with Conditional Value at Risk (CVaR) loss $\mathcal{L}_{CVaR}$ across a flattened loss landscape.
Figure 2: Overview of our proposed model, which uses a CLIP ViT to encode the input images, an MLP module with a robust CVaR loss, and an optimization step over a flattened loss landscape to distinguish COVID-19 cases from NON-COVID-19 cases.
Figure 3: Diagrammatic representation of our robust model with a teacher-student framework, which leverages unlabeled data to enhance detection performance.
Figure 4: Loss landscape visualization of our proposed method without (left) and with (right) the sharpness-aware minimization (SAM) method. The axis scales are the same in both plots.
Figure 5: 'Macro' F1 score for different $\alpha$ values.
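
The sharpness-aware minimization (SAM) method named in Figure 4 can be sketched as a two-pass update: first move the weights a distance `rho` along the normalized gradient, then apply the gradient measured at that perturbed point to the original weights. The snippet below is a minimal pure-Python illustration on a toy quadratic, not the paper's training code; `sam_step`, `toy_grad`, `toy_loss`, and the chosen `lr` and `rho` values are assumptions made for the example.

```python
import math


def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM update on a list of weights w.

    1. Compute the gradient g at w.
    2. Perturb to w_adv = w + rho * g / ||g|| (approximate worst case nearby).
    3. Descend from the original w using the gradient taken at w_adv.
    """
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(w)  # already at a stationary point; nothing to perturb
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


def toy_loss(w):
    """Toy objective f(w) = 0.5 * (w0^2 + 10 * w1^2), standing in for a real loss."""
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


def toy_grad(w):
    """Analytic gradient of toy_loss."""
    return [w[0], 10.0 * w[1]]


w = [1.0, 1.0]
for _ in range(100):
    w = sam_step(w, toy_grad, lr=0.05, rho=0.05)
final_loss = toy_loss(w)
```

In a real setup the two gradient evaluations would be two forward and backward passes through the trainable MLP, with the CLIP encoder kept frozen.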
