Learning under Label Noise through Few-Shot Human-in-the-Loop Refinement

Aaqib Saeed, Dimitris Spathis, Jungwoo Oh, Edward Choi, Ali Etemad

TL;DR

The paper tackles label noise in wearable time-series data for health sensing. It proposes Few-Shot Human-in-the-Loop Refinement (FHLR), a three-stage approach that trains a seed model on weak (smoothed) labels, fine-tunes it on a small set of expert-labeled examples, and merges the seed and fine-tuned models via weighted parameter averaging. Across four health-related tasks, FHLR outperforms eight noisy-label baselines, with up to 19% accuracy improvement under symmetric and asymmetric label noise, and the results indicate that simple parameter averaging can rival more complex ensembles. The method requires only a small amount of clean data and does not assume a specific noise distribution, which makes it a practical option for robust wearable health monitoring.
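The two noise settings named above can be illustrated with a small sketch. It assumes the common conventions that symmetric noise flips a label uniformly to any other class and asymmetric noise flips it to a fixed, class-dependent target; the paper's exact corruption protocol is not given in this summary, and the function names, `flip_map`, and the rates are hypothetical.

```python
import random
from typing import Dict, List


def add_symmetric_noise(labels: List[int], num_classes: int,
                        rate: float, rng: random.Random) -> List[int]:
    """With probability `rate`, replace a label by a uniformly chosen other class."""
    noisy = []
    for y in labels:
        if rng.random() < rate:
            other_classes = [c for c in range(num_classes) if c != y]
            noisy.append(rng.choice(other_classes))
        else:
            noisy.append(y)
    return noisy


def add_asymmetric_noise(labels: List[int], flip_map: Dict[int, int],
                         rate: float, rng: random.Random) -> List[int]:
    """With probability `rate`, flip a label to its class-specific target in `flip_map`."""
    return [
        flip_map[y] if (y in flip_map and rng.random() < rate) else y
        for y in labels
    ]


rng = random.Random(0)
clean = [0, 1, 2, 0, 1, 2]
# Hypothetical example: 40% symmetric noise over three classes.
symmetric = add_symmetric_noise(clean, num_classes=3, rate=0.4, rng=rng)
# Hypothetical example: class 0 is sometimes mistaken for class 1.
asymmetric = add_asymmetric_noise(clean, flip_map={0: 1}, rate=0.4, rng=rng)
```

Under these assumptions, symmetric noise spreads errors evenly across classes, while asymmetric noise concentrates them on confusable class pairs.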

Abstract

Wearable technologies enable continuous monitoring of various health metrics, such as physical activity, heart rate, sleep, and stress levels. A key challenge with wearable data is obtaining quality labels. Unlike modalities such as video, where the footage itself can be used to label objects or events, wearable data do not contain obvious cues about the physical manifestation of the users and usually require rich metadata. As a result, label noise can become an increasingly thorny issue when labeling such data. In this paper, we propose a novel solution to address noisy label learning, entitled Few-Shot Human-in-the-Loop Refinement (FHLR). Our method initially learns a seed model using weak labels. Next, it fine-tunes the seed model using a handful of expert corrections. Finally, it achieves better generalizability and robustness by merging the seed and fine-tuned models via weighted parameter averaging. We evaluate our approach on four challenging tasks and datasets, and compare it against eight competitive baselines designed to deal with noisy labels. We show that FHLR achieves significantly better performance when learning from noisy labels and achieves state-of-the-art results by a large margin, with up to 19% accuracy improvement under symmetric and asymmetric noise. Notably, we find that FHLR is particularly robust to increased label noise, unlike prior works that suffer from severe performance degradation. Our work not only achieves better generalization in high-stakes health sensing benchmarks but also sheds light on how noise affects commonly used models.
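The final merging step can be sketched as a convex combination of the two models' parameters. This is a minimal illustration, assuming both models share the same architecture and that parameters are stored as named lists of floats; the mixing weight `alpha`, the function name, and the toy parameter names are hypothetical, and the paper's own equation may use a different weighting convention.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def merge_parameters(seed: Params, refined: Params, alpha: float = 0.5) -> Params:
    """Return (1 - alpha) * seed + alpha * refined for every named parameter."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if seed.keys() != refined.keys():
        raise ValueError("models must share the same parameter names")
    merged: Params = {}
    for name, seed_values in seed.items():
        refined_values = refined[name]
        if len(seed_values) != len(refined_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            (1.0 - alpha) * s + alpha * r
            for s, r in zip(seed_values, refined_values)
        ]
    return merged


# Toy parameters standing in for a seed model (trained on weak labels)
# and a fine-tuned model (trained on a few expert corrections).
seed_model = {"dense.weight": [0.0, 2.0], "dense.bias": [1.0]}
refined_model = {"dense.weight": [4.0, 6.0], "dense.bias": [3.0]}
merged_model = merge_parameters(seed_model, refined_model, alpha=0.25)
```

With `alpha=0` the merged model equals the seed model and with `alpha=1` it equals the fine-tuned model; intermediate values trade off between the two, and in practice `alpha` would likely be chosen on held-out data.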

Paper Structure (12 sections, 1 equation, 4 figures, 5 tables)

Figures (4)

  • Figure 1: Illustration of the proposed noisy label learning framework. Overview of FHLR, a simple yet effective method for dealing with noisy labels by pre-training a model with weak labels, fine-tuning with expert annotations, and performing weight averaging to produce the final model.
  • Figure 2: Ablation of different label acquisition strategies.
  • Figure 3: Performance improvement with model merging.
  • Figure 4: Ablation of a varying number of shots, i.e., few clean examples.