PyPulse: A Python Library for Biosignal Imputation
Kevin Gao, Maxwell A. Xu, James M. Rehg, Alexander Moreno
TL;DR
Missing data in biosignals from clinical and wearable sensors hampers downstream analysis. The paper presents PyPulse, a modular Python library with YAML-driven configuration, support for custom datasets and missingness mechanisms, and an interactive visualization tool to compare imputations. It implements 11 imputation methods spanning classical and deep-learning approaches (e.g., transformers like BDCTransformer and DeepMVI) and enables end-to-end single-command workflows for training, evaluation, and visualization. The framework emphasizes extensibility and accessibility for non-ML researchers, addressing limitations of prior codebases and accelerating benchmarking in health monitoring and just-in-time interventions.
Abstract
We introduce PyPulse, a Python package for imputation of biosignals in both clinical and wearable sensor settings. Missingness is commonplace in these settings and can arise from multiple causes, such as insecure sensor attachment or data transmission loss. PyPulse provides a modular, extensible framework that is easy to use for a broad user base, including bioresearchers without a machine-learning background. Specifically, its new capabilities include using pre-trained imputation methods out of the box on custom datasets, running the full workflow of training or testing a baseline method with a single line of code, and comparing baseline methods in an interactive visualization tool. We released PyPulse under the MIT License on GitHub and PyPI. The source code can be found at: https://github.com/rehg-lab/pulseimpute.
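To make the imputation setting concrete, the following is a minimal, library-independent sketch of the kind of experiment such a framework automates: ablate a contiguous block of a signal (as might happen when a sensor detaches), fill the gap with a simple classical baseline (linear interpolation), and score the result on the missing samples only. It does not use PyPulse's API; the function names `simulate_block_missingness`, `linear_interpolation_impute`, and `mse_on_missing` are hypothetical and chosen for illustration, and the sine wave is a stand-in for a real biosignal.

```python
import math


def simulate_block_missingness(signal, start, length):
    """Return a copy of `signal` with a contiguous gap set to None,
    plus the list of indices that were removed."""
    missing = list(range(start, min(start + length, len(signal))))
    masked = list(signal)
    for i in missing:
        masked[i] = None
    return masked, missing


def linear_interpolation_impute(masked):
    """Fill None entries by linear interpolation between the nearest
    observed neighbors; gaps at the edges copy the nearest observed value."""
    observed = [i for i, v in enumerate(masked) if v is not None]
    if not observed:
        raise ValueError("cannot impute a signal with no observed samples")
    imputed = list(masked)
    for i, value in enumerate(masked):
        if value is not None:
            continue
        left = max((j for j in observed if j < i), default=None)
        right = min((j for j in observed if j > i), default=None)
        if left is None:
            imputed[i] = masked[right]
        elif right is None:
            imputed[i] = masked[left]
        else:
            w = (i - left) / (right - left)
            imputed[i] = (1 - w) * masked[left] + w * masked[right]
    return imputed


def mse_on_missing(truth, imputed, missing):
    """Mean squared error restricted to the samples that were removed."""
    return sum((truth[i] - imputed[i]) ** 2 for i in missing) / len(missing)


# Synthetic stand-in for a biosignal: a 1.2 Hz sine sampled at 100 Hz for 10 s.
signal = [math.sin(2 * math.pi * 1.2 * (n / 100)) for n in range(1000)]
masked, missing = simulate_block_missingness(signal, start=400, length=50)
imputed = linear_interpolation_impute(masked)
error = mse_on_missing(signal, imputed, missing)
print(f"MSE on the {len(missing)} missing samples: {error:.4f}")
```

Scoring only the ablated samples keeps the metric from being diluted by the observed samples, which any reasonable imputer leaves unchanged. Linear interpolation is exact on locally linear segments but degrades as the gap grows relative to the signal's period, which is one reason learned methods are benchmarked against it.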
