An Ensemble-Based Two-Step Framework for Classification of Pap Smear Cell Images
Theo Di Piazza, Loic Boussel
TL;DR
This work tackles automated Pap smear image classification to aid cervical cancer screening by proposing a two-stage ensemble framework. The pipeline first detects "rubbish" images, meaning images unsuitable for diagnosis, and then classifies the remaining images as healthy, unhealthy, or both. Each stage uses multiple pretrained backbones (CNNs and Vision Transformers) whose predicted probabilities are averaged across folds. Trained and evaluated on the APACC dataset with 5-fold cross-validation, the ensemble achieves higher macro-F1 and AUROC than the individual models, demonstrating robust performance under class imbalance and image artifacts. The approach offers a practical, scalable tool to assist cytologists and motivates future work on boosting or meta-learning to improve model fusion.
Abstract
Early detection of cervical cancer is crucial for improving patient outcomes and reducing mortality, because it allows precancerous lesions to be identified as soon as possible. As a result, the use of Pap smear screening has increased significantly, leading to a growing demand for automated tools that can assist cytologists in managing their rising workload. To address this, the Pap Smear Cell Classification Challenge (PS3C) was organized in association with ISBI 2025. The challenge aims to promote the development of automated tools for Pap smear image classification. The images are grouped into four categories: healthy, unhealthy, both, and rubbish, the last of which covers images considered unsuitable for diagnosis. In this work, we propose a two-stage ensemble approach: first, a neural network determines whether an image is rubbish. If it is not, a second neural network classifies the image as containing a healthy cell, an unhealthy cell, or both.
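The two-stage decision described above can be illustrated with a minimal sketch. It assumes that each model (or fold) has already produced probabilities for an image, and that ensembling is plain averaging of those probabilities. The function names, the 0.5 rubbish threshold, and the argmax rule in the second stage are illustrative assumptions, not details taken from the paper.

```python
from __future__ import annotations

from statistics import fmean
from typing import Sequence

# Second-stage classes, in the order assumed for the probability vectors.
STAGE2_LABELS = ("healthy", "unhealthy", "both")


def average_probabilities(prob_vectors: Sequence[Sequence[float]]) -> list[float]:
    """Average class-probability vectors element-wise across models or folds."""
    return [fmean(column) for column in zip(*prob_vectors)]


def two_stage_predict(
    rubbish_probs: Sequence[float],
    stage2_probs: Sequence[Sequence[float]],
    rubbish_threshold: float = 0.5,  # assumed cut-off, not from the paper
) -> str:
    """Return 'rubbish', 'healthy', 'unhealthy', or 'both' for one image.

    rubbish_probs: one P(rubbish) value per stage-1 model or fold.
    stage2_probs:  one [P(healthy), P(unhealthy), P(both)] vector per
                   stage-2 model or fold.
    """
    # Stage 1: averaged probability that the image is unsuitable for diagnosis.
    if fmean(rubbish_probs) >= rubbish_threshold:
        return "rubbish"
    # Stage 2: average the class probabilities, then take the most likely class.
    mean_probs = average_probabilities(stage2_probs)
    best = max(range(len(STAGE2_LABELS)), key=mean_probs.__getitem__)
    return STAGE2_LABELS[best]


# Hypothetical probabilities from three models for a single image.
label = two_stage_predict(
    rubbish_probs=[0.10, 0.20, 0.15],
    stage2_probs=[[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.2, 0.7, 0.1]],
)
```

In this made-up example two of the three models favour "healthy", yet the averaged probabilities favour "unhealthy" (about 0.47 against 0.43). This shows how probability averaging can differ from majority voting.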
