Application of Deep Learning Methods to Processing of Noisy Medical Video Data
Danil Afonchikov, Elena Kornaeva, Irina Makovik, Alexey Kornaev
TL;DR
This paper addresses counting white and red blood cells in noisy medical video data using curriculum learning (CL) and multi-view (MV) post-processing. It introduces a synthetic moving blood cell video dataset (1150 videos, 100 frames each, 3×128×128) to bridge in vitro complete blood count tests with prospective in vivo capillaroscopy data, and formalizes CL with $d = \alpha b + \beta l$ and $c(t) = \min\left( 1, \sqrt[p]{t \frac{1 - c_0^p}{T} + c_0^p} \right)$, while applying MV schemes (MVM and MVWCo-S) over multiple augmented crops. Experiments across WBC and RBC tasks show that MV improves accuracy (best ~$90.3\%$ for WBC with 9-frame inputs; RBC ~65% with 100-frame sequences) and that MV robustness extends to CIFAR-10 under label noise, where clean-label accuracy reaches ~92% and noisy-label accuracy ~87–88%. The results indicate that CL and MV offer practical, scalable improvements for processing small, high-noise medical video datasets and could enable more reliable in vivo CBC predictions.
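The two CL formulas above can be illustrated with a minimal sketch. It assumes that $b$ and $l$ are per-sample scalar scores (their meaning is not specified in this summary) and that at step $t$ the model trains on the easiest fraction $c(t)$ of the samples. That selection rule, the function names, and the default values are illustrative assumptions, not the authors' implementation.

```python
import math


def difficulty(b, l, alpha=0.5, beta=0.5):
    """Difficulty score d = alpha * b + beta * l (weights are placeholders)."""
    return alpha * b + beta * l


def competence(t, T, c0=0.1, p=2):
    """Pacing function c(t) = min(1, (t * (1 - c0^p) / T + c0^p)^(1/p))."""
    return min(1.0, (t * (1.0 - c0 ** p) / T + c0 ** p) ** (1.0 / p))


def available_samples(difficulties, t, T, c0=0.1, p=2):
    """Indices of the easiest c(t) fraction of samples (assumed selection rule)."""
    # Sort sample indices from easiest to hardest.
    order = sorted(range(len(difficulties)), key=lambda i: difficulties[i])
    # Always keep at least one sample available.
    k = max(1, math.ceil(competence(t, T, c0, p) * len(order)))
    return order[:k]
```

With these definitions, $c(0) = c_0$ and $c(t) = 1$ for all $t \ge T$, so the training pool grows from the easiest samples to the full dataset over $T$ steps.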
Abstract
Cell counting becomes a challenging problem when the cells move in a continuous stream and their boundaries are difficult to detect visually. To address this problem, we modified the training and decision-making processes using curriculum learning and multi-view prediction techniques, respectively.
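The idea of a multi-view prediction can be sketched as follows. The sketch assumes that a trained model has already produced one class label per augmented view of the same input, and it combines them by a plain majority vote. The function name and the tie-breaking rule are hypothetical, and the schemes used in the paper may weight or select views differently.

```python
from collections import Counter


def majority_vote(view_predictions):
    """Return the label predicted by most views; ties go to the earliest view."""
    counts = Counter(view_predictions)
    best = max(counts.values())  # raises ValueError if no views are given
    # Scan in view order so that ties are resolved deterministically.
    for label in view_predictions:
        if counts[label] == best:
            return label
```

For example, if five crops of one video yield the predicted counts 3, 3, 5, 3 and 4, the vote returns 3.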
