Overcoming Small Data Limitations in Video-Based Infant Respiration Estimation
Liyang Song, Hardik Bishnoi, Sai Kumar Reddy Manne, Sarah Ostadabbas, Briana J. Taylor, Michael Wan
TL;DR
This work tackles data scarcity and reproducibility problems in video-based infant respiration estimation by introducing AIR-400, a publicly available, annotated dataset of 400 clips from 18 infants. It pairs infant-specific ROI detection with optical-flow inputs and spatiotemporal networks to estimate respiration waveforms, trained with a PSD-based loss and evaluated with six-fold subject-wise cross-validation. The study shows that larger, carefully curated datasets improve performance, but it also reveals reproducibility concerns in prior work, especially regarding AIR-125. The contributions provide a foundation for benchmarking and advancing contactless infant respiratory monitoring in home and clinical settings.
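To illustrate what subject-wise cross-validation means in practice, here is a minimal Python sketch that assigns every clip of a given infant to the same fold, so that no subject appears in both the training and test sets of a split. The function name `subject_wise_folds`, the greedy balancing heuristic, and the toy clip counts are assumptions made for illustration; they are not taken from the AIR-400 code repository, and the authors' actual fold assignment may differ.

```python
from collections import defaultdict


def subject_wise_folds(clip_subjects, n_folds=6):
    """Yield (train_indices, test_indices) pairs with no subject shared
    between the two sides of any split.

    clip_subjects: list where entry i is the subject ID of clip i.
    Hypothetical helper for illustration only.
    """
    # Group clip indices by the subject they were recorded from.
    clips_by_subject = defaultdict(list)
    for idx, subject in enumerate(clip_subjects):
        clips_by_subject[subject].append(idx)

    # Greedy balancing (an assumption, not the paper's procedure):
    # place subjects with the most clips first into the smallest fold.
    folds = [[] for _ in range(n_folds)]
    ordered = sorted(clips_by_subject,
                     key=lambda s: (-len(clips_by_subject[s]), s))
    for subject in ordered:
        smallest = min(range(n_folds), key=lambda k: len(folds[k]))
        folds[smallest].extend(clips_by_subject[subject])

    # Each fold serves once as the test set; the rest form the training set.
    all_indices = set(range(len(clip_subjects)))
    for test in folds:
        yield sorted(all_indices - set(test)), sorted(test)


# Toy example: 18 hypothetical subjects with made-up clip counts.
clip_subjects = [f"S{i:02d}" for i in range(18) for _ in range(i % 4 + 1)]
for train, test in subject_wise_folds(clip_subjects, n_folds=6):
    test_subjects = sorted({clip_subjects[i] for i in test})
    print(len(train), len(test), test_subjects)
```

Splitting by subject rather than by clip matters for small datasets like this one: clips from the same infant are highly correlated, so a clip-level split would let a model be tested on subjects it has already seen and would overstate its performance.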
Abstract
The development of contactless respiration monitoring for infants could enable advances in the early detection and treatment of breathing irregularities, which are associated with neurodevelopmental impairments and conditions like sudden infant death syndrome (SIDS). While respiration estimation for adults is supported by a robust ecosystem of computer vision algorithms and video datasets, only one small public video dataset with annotated respiration data for infant subjects exists, and there are no reproducible algorithms that are effective for infants. We introduce the annotated infant respiration dataset of 400 videos (AIR-400), contributing 275 new, carefully annotated videos from 10 recruited subjects to the public corpus. We develop the first reproducible pipelines for infant respiration estimation, based on infant-specific region-of-interest detection and spatiotemporal neural processing enhanced by optical flow inputs. Through comprehensive experiments, we establish the first reproducible benchmarks for the state of the art in vision-based infant respiration estimation. We make our dataset, code repository, and trained models available for public use.
