
Active Learning for WBAN-based Health Monitoring

Cho-Chun Chiu, Tuan Nguyen, Ting He, Shiqiang Wang, Beom-Su Kim, Ki-Il Kim

TL;DR

This work addresses learning health-state classifiers from WBAN data under costly unlabeled samples and non-real-time labeling. It develops a two-phase approach that uses online predictive coreset construction to selectively collect unlabeled data based on noisy forecasts, followed by offline labeling to train the target model with theoretical guarantees. A generalized error bound adapts coreset theory to non-iid, nonzero-loss WBAN data, and a practical predictive coreset algorithm provides performance guarantees while reducing data curation costs. Empirical results on public health data and a real prototype show substantial cost savings (often >90%) with minimal loss in model quality, highlighting a path toward energy-efficient, personalized health monitoring in resource-constrained WBAN deployments.

Abstract

We consider a novel active learning problem motivated by the need to train machine learning models for health monitoring in wireless body area networks (WBANs). Because body sensors have limited resources, collecting each unlabeled sample in a WBAN incurs a nontrivial cost. Moreover, training health monitoring models typically requires labels indicating the patient's health state, which must be generated by healthcare professionals and cannot be obtained at the same pace as data collection. These challenges make our problem fundamentally different from classical active learning, where unlabeled samples are free and labels can be queried in real time. To handle these challenges, we propose a two-phase active learning method, consisting of an online phase, in which a coreset construction algorithm selects a subset of unlabeled samples based on their noisy predictions, and an offline phase, in which the selected samples are labeled to train the target model. The samples selected by our algorithm are proved to approximate the full dataset with a guaranteed error when evaluating the loss function. Our evaluation, based on real health monitoring data and our own experimentation, demonstrates that our solution can drastically reduce the data curation cost without sacrificing the quality of the target model.

Paper Structure

This paper contains 28 sections, 4 theorems, 29 equations, 21 figures, 1 algorithm.

Key Result

Theorem 3.1

Given $n$ unlabeled samples $P:=\{\bm{x}_i\}_{i=1}^n$ and a corresponding weighted set $S$ with $\|\bm{x}_i-\bm{x}_{j_i}\|\leq \delta,\: \forall \bm{x}_i\in P$, if the labels $y_1,\ldots,y_n$ of $P$ are conditionally independent [...] for any given $\bm{w}\in \mathcal{W}$.

Note: Throughout this paper, $\|\bm{x}\|$ denotes the $\ell_2$ norm of vector $\bm{x}$.
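
The covering condition in the theorem, $\|\bm{x}_i-\bm{x}_{j_i}\|\leq \delta$ for every $\bm{x}_i\in P$, can be illustrated with a small sketch. The code below is not the paper's predictive coreset algorithm: it is a generic greedy $\delta$-cover that assumes each sample is observed exactly (no forecast noise) and that the weight of a selected sample is simply the number of samples it represents. The function name `greedy_delta_cover` and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def greedy_delta_cover(
    samples: Sequence[Sequence[float]], delta: float
) -> Tuple[List[int], List[int], List[int]]:
    """Greedy delta-cover (illustrative assumption, not the paper's method).

    Returns (centers, weights, assignment), where centers holds indices of
    the selected samples, weights[k] counts the samples represented by
    centers[k], and assignment[i] is the index j_i of the selected sample
    that represents samples[i].
    """
    centers: List[int] = []
    weights: List[int] = []
    assignment: List[int] = []
    for i, x in enumerate(samples):
        # Find the nearest already-selected sample in l2 distance.
        best, best_dist = -1, float("inf")
        for k, c in enumerate(centers):
            d = math.dist(x, samples[c])
            if d < best_dist:
                best, best_dist = k, d
        if best_dist <= delta:
            # Already covered: fold this sample into its representative.
            weights[best] += 1
            assignment.append(centers[best])
        else:
            # Not covered: select this sample as a new representative.
            centers.append(i)
            weights.append(1)
            assignment.append(i)
    return centers, weights, assignment


# Toy data (hypothetical): three loose clusters in the plane.
samples = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.05, 0.95), (0.0, 0.2), (3.0, 3.0)]
delta = 0.3
centers, weights, assignment = greedy_delta_cover(samples, delta)
print(centers, weights, assignment)
```

By construction, every sample lies within `delta` of its assigned representative and the weights sum to $n$, which is the shape of weighted set the theorem takes as input. How the paper handles noisy forecasts when deciding which samples to collect is not shown here.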

Figures (21)

  • Figure 1: Workflow for active learning in WBAN.
  • Figure 2: Results of time series forecasting on public data.
  • Figure 3: Distribution of selected samples in Case 1.
  • Figure 4: Distribution of selected samples in Case 2.
  • Figure 5: Distribution of selected samples in Case 3.
  • ...and 16 more figures

Theorems & Definitions (7)

  • Theorem 3.1
  • Theorem 4.1
  • Corollary 4.2
  • Lemma A.1
  • Proof
  • Proof of Theorem (label: thm:approx error bound)
  • Proof of Theorem (label: thm:delta-cover guarantee)