Accurate Leukocyte Detection Based on Deformable-DETR and Multi-Level Feature Fusion for Aiding Diagnosis of Blood Diseases
Yifei Chen, Chenyan Zhang, Ben Chen, Yiyu Huang, Yifei Sun, Changmiao Wang, Xianjun Fu, Yuxing Dai, Feiwei Qin, Yong Peng, Yu Gao
TL;DR
Leukocyte detection in blood smear images is challenged by scale variation and sparse discriminative features, which complicates automated diagnosis. The authors present MFDS-DETR, a DETR-based framework that combines a ResNet-50-based backbone, a High-level Screening-feature Pyramid Network (HS-FPN), and a deformable self-attention encoder with a DETR-style decoder. The model is optimized with a joint loss that includes $L_{class}$, $L_{box}$ (comprising $L_{GIoU}$ and $L_1$), and an auxiliary component. On the WBCDD, LISC, and BCCD datasets, MFDS-DETR is reported to achieve state-of-the-art performance, outperforming traditional two-stage and single-stage detectors, with notable gains for challenging classes such as eosinophils and lymphocytes. The work also introduces the WBCDD dataset and provides code on GitHub, highlighting strong generalization and potential for aiding hematologic diagnosis in clinical settings. Limitations include dependence on dataset size and quality, which motivates future work toward larger, more diverse data collections and continued architectural enhancements.
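To make the box term of the loss concrete, here is a minimal sketch of $L_{box}$ as a weighted sum of $L_{GIoU}$ and $L_1$ for a single predicted/target box pair. It is not taken from the authors' code: the function names, the corner format `(x1, y1, x2, y2)`, and the weights `w_giou=2.0` and `w_l1=5.0` are assumptions chosen for illustration, and both boxes are assumed to have non-zero area.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2) corners."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    hull_w = max(ax2, bx2) - min(ax1, bx1)
    hull_h = max(ay2, by2) - min(ay1, by1)
    hull = hull_w * hull_h

    # GIoU penalizes the empty part of the enclosing box.
    return iou - (hull - union) / hull


def box_loss(pred, target, w_giou=2.0, w_l1=5.0):
    """Hypothetical L_box = w_giou * L_GIoU + w_l1 * L_1 (weights are assumptions)."""
    l_giou = 1.0 - giou(pred, target)
    l_l1 = sum(abs(p - t) for p, t in zip(pred, target))
    return w_giou * l_giou + w_l1 * l_l1


if __name__ == "__main__":
    print(box_loss((0.1, 0.1, 0.5, 0.5), (0.2, 0.2, 0.6, 0.6)))
```

The GIoU term still gives a useful signal when the predicted and target boxes do not overlap, while the $L_1$ term penalizes coordinate error directly, which is presumably why the two are combined.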
Abstract
In standard hospital blood tests, the traditional process requires doctors to manually isolate leukocytes from microscopic images of patients' blood. These isolated leukocytes are then categorized by automatic leukocyte classifiers to determine the proportion and volume of the different leukocyte types present in the blood samples, which aids disease diagnosis. This methodology is not only time-consuming and labor-intensive but also prone to errors caused by factors such as image quality and environmental conditions, which can lead to incorrect subsequent classifications and misdiagnosis. To address these issues, this paper proposes an innovative method of leukocyte detection: the Multi-level Feature Fusion and Deformable Self-attention DETR (MFDS-DETR). To tackle the issue of leukocyte scale disparity, we designed the High-level Screening-feature Fusion Pyramid (HS-FPN), which enables multi-level fusion. This module uses high-level features as weights to filter low-level feature information via a channel attention module and then merges the screened information with the high-level features, thus enhancing the model's feature expression capability. Further, we address the issue of leukocyte feature scarcity by incorporating a multi-scale deformable self-attention module in the encoder and using self-attention and cross-deformable attention mechanisms in the decoder, which aids the extraction of global features from the leukocyte feature maps. The effectiveness, superiority, and generalizability of the proposed MFDS-DETR method are confirmed through comparisons with other cutting-edge leukocyte detection models on the private WBCDD dataset and the public LISC and BCCD datasets. Our source code and private WBCDD dataset are available at https://github.com/JustlfC03/MFDS-DETR.
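The screening-and-fusion idea behind HS-FPN can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the authors' implementation: the channel attention here is just global average pooling followed by a sigmoid (no learned layers), both feature maps are assumed to already have the same number of channels, the high-level map is assumed to be exactly half the spatial size of the low-level map, and nearest-neighbor repetition stands in for whatever upsampling the model actually uses. The function names are hypothetical.

```python
import numpy as np


def channel_attention_weights(high):
    """Per-channel weights in (0, 1) from a high-level map of shape (C, H, W).

    Assumption: global average pooling followed by a sigmoid, with no
    learned parameters.
    """
    pooled = high.mean(axis=(1, 2))          # (C,)
    return 1.0 / (1.0 + np.exp(-pooled))     # sigmoid


def screen_and_fuse(low, high):
    """Filter low-level features with high-level weights, then merge.

    low:  (C, 2H, 2W) low-level feature map
    high: (C, H, W)   high-level feature map
    """
    weights = channel_attention_weights(high)            # (C,)
    screened = low * weights[:, None, None]              # channel-wise filtering
    # Nearest-neighbor 2x upsampling so the shapes match (an assumption).
    upsampled = high.repeat(2, axis=1).repeat(2, axis=2)
    return screened + upsampled                          # fuse by addition


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    low = rng.normal(size=(4, 8, 8))
    high = rng.normal(size=(4, 4, 4))
    fused = screen_and_fuse(low, high)
    print(fused.shape)
```

The point of the sketch is the direction of information flow: the high-level map decides which low-level channels are kept, and the filtered low-level detail is then added back onto the high-level semantics.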
