Multi-Stage Fine-Tuning of Pathology Foundation Models with Head-Diverse Ensembling for White Blood Cell Classification

Antony Gitau; Martin Paulson; Bjørn-Jostein Singstad; Karl Thomas Hjelmervik; Ola Marius Lysaker; Veralia Gabriela Sanchez

Abstract

The classification of white blood cells (WBCs) from peripheral blood smears is critical for the diagnosis of leukemia. However, automated approaches still struggle with class imbalance, domain shift, and morphological continuum confusion, where adjacent maturation stages exhibit subtle, overlapping features. We present a multi-stage fine-tuning methodology for 13-class WBC classification in the WBCBench 2026 Challenge (ISBI 2026). Our best-performing model is a fine-tuned DinoBloom-base, on which we train multiple classifier head families: linear, cosine, and multilayer perceptron (MLP). The cosine head performed best on the mature granulocyte boundary (band neutrophil (BNE), F1 = 0.470), the linear head on more immature granulocyte classes (metamyelocyte (MMY), F1 = 0.585), and the MLP head on the most immature granulocyte (promyelocyte (PMY), F1 = 0.733), revealing class-specific specialization. Based on this specialization, we construct a head-diverse ensemble in which the MLP head acts as the primary predictor and its predictions within four predefined confusion pairs are replaced only when the two other head families agree. We further show that cases consistently misclassified by all models are substantially enriched for probable labeling errors or inherent morphological ambiguity.
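The override rule described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not list the four confusion pairs, so the adjacent maturation stages (PMY/MY, MY/MMY, MMY/BNE, BNE/SNE) are used as hypothetical placeholders, and the names `CONFUSION_PAIRS`, `head_diverse_vote`, and `ensemble` are invented for this example. It also assumes that an override happens only when the MLP label and the label agreed on by the linear and cosine heads form one of the predefined pairs.

```python
from typing import FrozenSet, List, Sequence

# Hypothetical confusion pairs: the source mentions four predefined pairs
# but does not name them; adjacent maturation stages are assumed here.
CONFUSION_PAIRS: FrozenSet[FrozenSet[str]] = frozenset({
    frozenset({"PMY", "MY"}),
    frozenset({"MY", "MMY"}),
    frozenset({"MMY", "BNE"}),
    frozenset({"BNE", "SNE"}),
})


def head_diverse_vote(mlp: str, linear: str, cosine: str) -> str:
    """Keep the MLP label unless both auxiliary heads agree on a different
    label that forms a predefined confusion pair with the MLP label."""
    auxiliary_agree = linear == cosine
    if auxiliary_agree and linear != mlp and frozenset({mlp, linear}) in CONFUSION_PAIRS:
        return linear
    return mlp


def ensemble(mlp_preds: Sequence[str],
             linear_preds: Sequence[str],
             cosine_preds: Sequence[str]) -> List[str]:
    """Apply the override rule to each sample's three head predictions."""
    return [head_diverse_vote(m, l, c)
            for m, l, c in zip(mlp_preds, linear_preds, cosine_preds)]


# Toy predictions for four cells ("LY" is an illustrative label outside the pairs).
mlp_preds = ["BNE", "PMY", "MMY", "SNE"]
linear_preds = ["SNE", "MY", "BNE", "LY"]
cosine_preds = ["SNE", "MMY", "BNE", "LY"]
print(ensemble(mlp_preds, linear_preds, cosine_preds))
# → ['SNE', 'PMY', 'BNE', 'SNE']
```

In the toy run, the first and third cells are overridden because the auxiliary heads agree within a confusion pair; the second is kept because the auxiliary heads disagree; the fourth is kept because the agreed label lies outside every pair.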

Paper Structure (11 sections, 7 equations, 7 figures, 1 table)

Introduction
Materials and Methods
Dataset
Backbone Models and Classification Heads
Multi-Stage Fine-tuning
Head-Diverse Ensembling
Expert Review of Disagreement Cases
Results
Expert Label Review Results
Discussion and Conclusion
Compliance with Ethical Standards

Figures (7)

Figure 1: An illustration of the morphological continuum from promyelocyte (PMY) to myelocyte (MY), metamyelocyte (MMY), band-form neutrophil (BNE), and segmented neutrophil (SNE).
Figure 2: Simplified illustration of end-to-end fine-tuning and inference. During training, separate models are obtained by fine-tuning a pathology foundation model (DinoBloom) with different classifier heads (linear, cosine, and MLP) across staged optimization. During inference, the saved full DinoBloom-base and head checkpoints are combined using a head-diverse ensemble, in which the MLP head acts as the primary predictor and is conditionally overridden when the auxiliary heads agree.
Figure 3: Distributions of white blood cell types in the training set.
Figure 4: Comparison of our models' performance on the validation and test sets.
Figure 5: Comparison of classification head performance for the boundary subsets.
...and 2 more figures
