
Accurate Leukocyte Detection Based on Deformable-DETR and Multi-Level Feature Fusion for Aiding Diagnosis of Blood Diseases

Yifei Chen, Chenyan Zhang, Ben Chen, Yiyu Huang, Yifei Sun, Changmiao Wang, Xianjun Fu, Yuxing Dai, Feiwei Qin, Yong Peng, Yu Gao

TL;DR

Leukocyte detection in blood smear images is challenged by scale variation and sparse discriminative features, complicating automated diagnosis. The authors present MFDS-DETR, a DETR-based framework that combines a ResNet-50–based backbone, a High-level Screening-feature Pyramid Network (HS-FPN), and a deformable self-attention encoder with a DETR-style decoder, optimized via a joint loss that includes $L_{class}$, $L_{box}$ (comprising $L_{GIoU}$ and $L_1$), and an auxiliary component. Across the WBCDD, LISC, and BCCD datasets, MFDS-DETR achieves state-of-the-art performance, substantially outperforming traditional two-stage and single-stage detectors, with notable gains for challenging classes such as eosinophils and lymphocytes. The work also introduces the WBCDD dataset and provides code on GitHub, highlighting strong generalization and potential for aiding hematologic diagnosis in clinical settings. Limitations include dependence on dataset size and quality, motivating future work toward larger, more diverse data collections and continued architectural enhancements.
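
One hedged reading of the joint loss named above, assuming the terms are combined as a weighted sum. The weights $\lambda_{GIoU}$ and $\lambda_{L_1}$ and the symbol $L_{aux}$ for the auxiliary component are illustrative notation, not taken from the source:

$$L = L_{class} + L_{box} + L_{aux}, \qquad L_{box} = \lambda_{GIoU} L_{GIoU} + \lambda_{L_1} L_1$$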

Abstract

In standard hospital blood tests, doctors traditionally isolate leukocytes manually from microscopic images of patients' blood. These isolated leukocytes are then categorized via automatic leukocyte classifiers to determine the proportion and volume of different types of leukocytes present in the blood samples, aiding disease diagnosis. This process is time-consuming and labor-intensive, and it is prone to errors caused by factors such as image quality and environmental conditions, which can lead to incorrect subsequent classification and misdiagnosis. To address these issues, this paper proposes an innovative method of leukocyte detection: the Multi-level Feature Fusion and Deformable Self-attention DETR (MFDS-DETR). To tackle the issue of leukocyte scale disparity, we designed the High-level Screening-feature Fusion Pyramid (HS-FPN), enabling multi-level fusion. This module uses high-level features as weights to filter low-level feature information via a channel attention module and then merges the screened information with the high-level features, thus enhancing the model's feature expression capability. Further, we address the issue of leukocyte feature scarcity by incorporating a multi-scale deformable self-attention module in the encoder and using the self-attention and cross-deformable attention mechanisms in the decoder, which aids in the extraction of the global features of the leukocyte feature maps. The effectiveness, superiority, and generalizability of the proposed MFDS-DETR method are confirmed through comparisons with other cutting-edge leukocyte detection models using the private WBCDD dataset and the public LISC and BCCD datasets. Our source code and private WBCDD dataset are available at https://github.com/JustlfC03/MFDS-DETR.
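
The screening-and-fusion idea described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. It assumes that global average pooling followed by a sigmoid stands in for the channel attention module, that nearest-neighbour upsampling stands in for the upsampling step, and that the low-level and high-level maps already have the same number of channels. The function names `channel_weights`, `upsample_nearest`, and `screen_and_fuse` are hypothetical.

```python
import math


def channel_weights(high):
    """Per-channel weights from a high-level feature map shaped [C][H][W].

    Assumption: global average pooling + sigmoid replaces the learned
    channel attention module; no trainable layers are modelled here.
    """
    weights = []
    for channel in high:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        weights.append(1.0 / (1.0 + math.exp(-mean)))
    return weights


def upsample_nearest(channel, factor):
    """Nearest-neighbour upsampling of one [H][W] channel (assumed stand-in)."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def screen_and_fuse(low, high, factor=2):
    """Weight low-level channels by high-level attention, then add the
    upsampled high-level features to the screened result."""
    weights = channel_weights(high)
    fused = []
    for w, low_c, high_c in zip(weights, low, high):
        up = upsample_nearest(high_c, factor)
        fused.append([[w * l + h for l, h in zip(low_row, up_row)]
                      for low_row, up_row in zip(low_c, up)])
    return fused


# Toy input: 2 channels, a 1x1 high-level map and a 2x2 low-level map.
high = [[[0.0]], [[2.0]]]
low = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]]]
fused = screen_and_fuse(low, high)
```

In this toy input the first channel receives a weight of 0.5 (the sigmoid of 0), so its low-level values are halved before the all-zero high-level map is added. A practical version would use learned layers in a deep learning framework.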

Paper Structure

This paper contains 23 sections, 10 equations, 11 figures, 10 tables.

Figures (11)

  • Figure 1: The overall architecture of MFDS-DETR comprises four parts: Backbone, High-level Screening-feature Pyramid Networks, Encoder and Decoder.
  • Figure 2: The framework of the High-level Screening-feature Fusion Pyramid Networks comprises two parts: the Feature Selection Module and the Feature Fusion Module.
  • Figure 3: The framework of the SFF Module. A combination of transposed convolution and bilinear interpolation is used to process high-level features and achieve purposeful feature fusion.
  • Figure 4: The framework of the Deformable Self-attention Module comprises two parts: the Offset Module and the Attention Module.
  • Figure 5: Presentation of blood images in various datasets.
  • ...and 6 more figures