MetaChest: Generalized few-shot learning of pathologies from chest X-rays
Berenice Montalvo-Lezama, Gibran Fuentes-Pineda
TL;DR
The paper tackles the scarcity of annotated chest X-ray data by adopting generalized few-shot learning (GFSL) to classify multiple pathologies across both seen and unseen classes. It introduces MetaChest, a large, multi-source chest X-ray dataset with a dedicated meta-learning partition and a multi-label episode generator for GFSL evaluation. Experiments compare a standard transfer learning approach (BatchBased) with a multi-label extension of ProtoNet (ProtoNet-ML). They show that more classes per episode and more training examples per class both improve performance, and that the transfer learning approach outperforms the meta-learning one in this domain. The study also finds that higher-resolution images improve accuracy at additional computational cost, while efficient architectures perform comparably to larger models with far fewer resources. Directions for future research include vision foundation models and multimodal integration.
Abstract
The limited availability of annotated data presents a major challenge for applying deep learning methods to medical image analysis. Few-shot learning methods aim to recognize new classes from only a small number of labeled examples. These methods are typically studied under the standard few-shot learning setting, where all classes in a task are new. However, medical applications such as pathology classification from chest X-rays often require learning new classes while simultaneously leveraging knowledge of previously known ones, a scenario more closely aligned with generalized few-shot classification. Despite its practical relevance, few-shot learning has been scarcely studied in this context. In this work, we present MetaChest, a large-scale dataset of 479,215 chest X-rays collected from four public databases. MetaChest includes a meta-set partition specifically designed for standard few-shot classification, as well as an algorithm for generating multi-label episodes. We conduct extensive experiments evaluating both a standard transfer learning approach and an extension of ProtoNet across a wide range of few-shot multi-label classification tasks. Our results demonstrate that increasing the number of classes per episode and the number of training examples per class improves classification performance. Notably, the transfer learning approach consistently outperforms the ProtoNet extension, despite not being tailored for few-shot learning. We also show that higher-resolution images improve accuracy at the cost of additional computation, while efficient model architectures achieve comparable performance to larger models with significantly reduced resource requirements.
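To make the idea of a multi-label episode concrete, the following is a minimal sketch of one way a k-shot support set could be drawn when an image may carry several labels at once. It is not the MetaChest algorithm, which the abstract does not describe. The function name `sample_support_set`, its parameters, and the pathology names in the usage note are hypothetical and chosen only for illustration. The sketch assumes every episode class has at least `k_shot` labeled examples in the pool.

```python
import random
from collections import Counter


def sample_support_set(labels, episode_classes, k_shot, seed=0):
    """Greedily draw examples until each episode class has >= k_shot positives.

    labels: dict mapping an integer example id to the set of class names it carries.
    episode_classes: iterable of class names that form the episode.
    Returns (chosen_ids, counts), where counts[c] is the number of chosen
    examples labeled with class c. Hypothetical helper, not the paper's method.
    """
    rng = random.Random(seed)
    class_set = set(episode_classes)
    order = sorted(class_set)
    rng.shuffle(order)  # visit classes in random order

    counts = Counter()
    chosen, used = [], set()
    for c in order:
        # Unused examples that carry label c.
        pool = [i for i in sorted(labels) if c in labels[i] and i not in used]
        rng.shuffle(pool)
        while counts[c] < k_shot and pool:
            ex = pool.pop()
            chosen.append(ex)
            used.add(ex)
            # A multi-label example counts toward every episode class it carries,
            # so some classes may end up with more than k_shot positives.
            for lab in labels[ex]:
                if lab in class_set:
                    counts[lab] += 1
    return chosen, counts
```

For example, calling `sample_support_set({0: {"effusion"}, 1: {"effusion", "pneumonia"}, 2: {"pneumonia"}}, ["effusion", "pneumonia"], k_shot=1)` returns a list of distinct example ids in which each of the two classes appears at least once. Because labels co-occur, the exact number of positives per class can exceed `k_shot`, which is one reason multi-label episodes need a dedicated sampler rather than independent per-class draws.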
