Table of Contents
Fetching ...

Ensemble of ConvNeXt V2 and MaxViT for Long-Tailed CXR Classification with View-Based Aggregation

Yosuke Yamagishi, Shouhei Hanaoka

TL;DR

The proposed method combines state-of-the-art image classification techniques, asymmetric loss for handling class imbalance, and view-based prediction aggregation to enhance classification performance.

Abstract

In this work, we present our solution for the MICCAI 2024 CXR-LT challenge, achieving 4th place in Subtask 2 and 5th in Subtask 1. We leveraged an ensemble of ConvNeXt V2 and MaxViT models, pretrained on an external chest X-ray dataset, to address the long-tailed distribution of chest findings. The proposed method combines state-of-the-art image classification techniques, asymmetric loss for handling class imbalance, and view-based prediction aggregation to enhance classification performance. Through experiments, we demonstrate the advantages of our approach in improving both detection accuracy and the handling of the long-tailed distribution in CXR findings. The code is available at https://github.com/yamagishi0824/cxrlt24-multiview-pp.

Ensemble of ConvNeXt V2 and MaxViT for Long-Tailed CXR Classification with View-Based Aggregation

TL;DR

The proposed method combines state-of-the-art image classification techniques, asymmetric loss for handling class imbalance, and view-based prediction aggregation to enhance classification performance.

Abstract

In this work, we present our solution for the MICCAI 2024 CXR-LT challenge, achieving 4th place in Subtask 2 and 5th in Subtask 1. We leveraged an ensemble of ConvNeXt V2 and MaxViT models, pretrained on an external chest X-ray dataset, to address the long-tailed distribution of chest findings. The proposed method combines state-of-the-art image classification techniques, asymmetric loss for handling class imbalance, and view-based prediction aggregation to enhance classification performance. Through experiments, we demonstrate the advantages of our approach in improving both detection accuracy and the handling of the long-tailed distribution in CXR findings. The code is available at https://github.com/yamagishi0824/cxrlt24-multiview-pp.

Paper Structure

This paper contains 24 sections, 1 figure, 4 tables.

Figures (1)

  • Figure 1: Multi-View Aggregation Process for Chest X-ray Prediction. The diagram illustrates the method of combining predictions from multiple chest X-ray views to produce a final prediction value. Probabilities from frontal view(s) and lateral view(s) are processed through view-specific averaging, followed by weighted averaging across different views to generate the final prediction score.