PyPulse: A Python Library for Biosignal Imputation

Kevin Gao, Maxwell A. Xu, James M. Rehg, Alexander Moreno

TL;DR

Missing data in biosignals from clinical and wearable sensors hampers downstream analysis. The paper presents PyPulse, a modular Python library with YAML-driven configuration, support for custom datasets and missingness mechanisms, and an interactive visualization tool to compare imputations. It implements 11 imputation methods spanning classical and deep-learning approaches (e.g., transformers like BDCTransformer and DeepMVI) and enables end-to-end single-command workflows for training, evaluation, and visualization. The framework emphasizes extensibility and accessibility for non-ML researchers, addressing limitations of prior codebases and accelerating benchmarking in health monitoring and just-in-time interventions.

Abstract

We introduce PyPulse, a Python package for imputation of biosignals in both clinical and wearable sensor settings. Missingness is commonplace in these settings and can arise from multiple causes, such as insecure sensor attachment or data transmission loss. PyPulse provides a modular, extensible framework that is easy to use for a broad user base, including bioresearchers without a machine-learning background. Specifically, its new capabilities include using pre-trained imputation methods out of the box on custom datasets, running the full workflow of training or testing a baseline method with a single line of code, and comparing baseline methods in an interactive visualization tool. We released PyPulse under the MIT License on GitHub and PyPI. The source code can be found at: https://github.com/rehg-lab/pulseimpute.

Paper Structure

This paper contains 4 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: a) PyPulse framework, in which modules are shown in blue, classes in green, and I/O in orange. b) Example experiment config demonstrating model, data, and train inputs.
  • Figure 2: Visualization demo showing the imputation results of BDC Transformer, Vanilla Transformer, DeepMVI, and Linear Interpolation, compared to the ground truth. The left panel shows the inputs for the interactive visualization; the right panel shows the resulting plot.
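
To make the comparison in Figure 2 concrete, the following is a minimal sketch of linear interpolation, the simplest of the baselines named there, scored against a known ground truth. It is not PyPulse's implementation and does not use PyPulse's API: the function names `linear_interpolate` and `mean_absolute_error`, the use of NaN to mark missing samples, and the toy signal are all assumptions made for illustration.

```python
import math


def linear_interpolate(signal):
    """Fill NaN gaps with straight lines between the nearest observed samples.

    Hypothetical helper for illustration; not part of PyPulse.
    """
    filled = list(signal)
    observed = [i for i, v in enumerate(filled) if not math.isnan(v)]
    if not observed:
        raise ValueError("signal has no observed samples")

    # Gaps at the edges have only one neighbor, so hold the nearest value.
    first, last = observed[0], observed[-1]
    for i in range(first):
        filled[i] = filled[first]
    for i in range(last + 1, len(filled)):
        filled[i] = filled[last]

    # Interior gaps: interpolate between consecutive observed indices.
    for left, right in zip(observed, observed[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def mean_absolute_error(truth, imputed, missing_idx):
    """Average absolute error over the imputed (originally missing) samples."""
    return sum(abs(truth[i] - imputed[i]) for i in missing_idx) / len(missing_idx)


# Toy signal with a two-sample gap removed from a known ground truth.
truth = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
missing_idx = [2, 3]
observed = [math.nan if i in missing_idx else v for i, v in enumerate(truth)]

imputed = linear_interpolate(observed)
error = mean_absolute_error(truth, imputed, missing_idx)
print(imputed, error)
```

Scoring only the samples that were masked out, rather than the whole signal, keeps the observed samples from diluting the error; this is a common choice for imputation benchmarks and is assumed here rather than taken from the paper.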