FedSCS-XGB -- Federated Server-centric surrogate XGBoost for continual health monitoring

Felix Walger, Mehdi Ejtehadi, Anke Schmeink, Diego Paez-Granados

TL;DR

A novel distributed machine learning protocol for human activity recognition (HAR) from wearable sensor data, based on gradient-boosted decision trees (XGBoost) and inspired by Party-Adaptive XGBoost (PAX), that explicitly preserves key structural and optimization properties of standard XGBoost.

Abstract

Wearable sensors with local data processing can detect health threats early, enhance documentation, and support personalized therapy. In the context of spinal cord injury (SCI), which involves risks such as pressure injuries and blood pressure instability, continuous monitoring can help mitigate these risks by enabling early detection and intervention. In this work, we present a novel distributed machine learning (DML) protocol for human activity recognition (HAR) from wearable sensor data based on gradient-boosted decision trees (XGBoost). The proposed architecture is inspired by Party-Adaptive XGBoost (PAX) while explicitly preserving key structural and optimization properties of standard XGBoost, including histogram-based split construction and tree-ensemble dynamics. First, we provide a theoretical analysis showing that, under appropriate data conditions and suitable hyperparameter selection, the proposed distributed protocol can converge to solutions equivalent to centralized XGBoost training. Second, the protocol is empirically evaluated on a representative wearable-sensor HAR dataset, reflecting the heterogeneity and data fragmentation typical of remote monitoring scenarios. Benchmarking against centralized XGBoost and IBM PAX demonstrates that the theoretical convergence properties are reflected in practice. The results indicate that the proposed approach can match centralized performance to within a gap of under 1% while retaining the structural advantages of XGBoost in distributed wearable-based HAR settings.

Paper Structure

This paper contains 22 sections, 5 theorems, 14 equations, 4 figures, 2 tables, 2 algorithms.

Key Result

Lemma 1

Fix a boosting round $t$ and sketch edges $\tilde{E}^{(t)}$. Greedy split evaluation using routed atom statistics is exactly equivalent to running histogram XGBoost on the induced atom pseudo-dataset with edges $\tilde{E}^{(t)}$.
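The intuition behind this kind of equivalence can be illustrated with a minimal sketch. It relies only on a standard property of histogram XGBoost: for fixed bin edges, the split gain $\tfrac{1}{2}\left[\frac{G_L^2}{H_L+\lambda} + \frac{G_R^2}{H_R+\lambda} - \frac{(G_L+G_R)^2}{H_L+H_R+\lambda}\right]$ depends on the data only through per-bin sums of gradients and Hessians. The sketch below is not the paper's protocol. It assumes, purely for illustration, that an "atom" is one pseudo-sample per bin carrying that bin's aggregated statistics; the function name `split_gains`, the synthetic data, and the quantile edges are all hypothetical, and the complexity penalty $\gamma$ is omitted.

```python
import numpy as np

def split_gains(bin_idx, g, h, n_bins, lam=1.0):
    """Gain of splitting after each interior bin boundary,
    computed from per-bin gradient/Hessian sums only."""
    G = np.bincount(bin_idx, weights=g, minlength=n_bins)
    H = np.bincount(bin_idx, weights=h, minlength=n_bins)
    GL, HL = np.cumsum(G)[:-1], np.cumsum(H)[:-1]   # left-child prefix sums
    GT, HT = G.sum(), H.sum()                       # node totals
    GR, HR = GT - GL, HT - HL                       # right-child sums
    score = lambda G_, H_: G_ ** 2 / (H_ + lam)
    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GT, HT))

# Synthetic raw data for a single feature (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
g = rng.normal(size=200)               # first-order gradients
h = rng.uniform(0.1, 1.0, size=200)    # second-order gradients (Hessians)

# Fixed edges, standing in for the sketch edges of one boosting round.
edges = np.quantile(x, [0.2, 0.4, 0.6, 0.8])
n_bins = len(edges) + 1
bins = np.searchsorted(edges, x)       # bin index in 0..n_bins-1

raw_gains = split_gains(bins, g, h, n_bins)

# Assumed "atom" pseudo-dataset: one pseudo-sample per bin with aggregated (g, h).
atom_bins = np.arange(n_bins)
atom_g = np.bincount(bins, weights=g, minlength=n_bins)
atom_h = np.bincount(bins, weights=h, minlength=n_bins)
atom_gains = split_gains(atom_bins, atom_g, atom_h, n_bins)

# Both datasets yield the same gain for every candidate split.
print(np.allclose(raw_gains, atom_gains))
```

Because greedy split selection only compares these gains, two datasets with identical per-bin statistics lead to the same chosen split under the same edges, which is the sense in which a pseudo-dataset can stand in for the raw samples in this sketch.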

Figures (4)

  • Figure 1: A continual health monitoring system with federated learning (FL).
  • Figure 2: Sensor setup for the SCI wheelchair user: VivaLNK wearable ECG monitor sensor on the chest, two Corsano CardioWatch 287-2 devices worn at the wrists, Sensomative wheelchair mat placed under the bottom cushion, and ZurichMOVE IMU on the wheel of the wheelchair.
  • Figure 3: Performance illustration of FedSCS, PAX and XGBoost-baseline for varying bin sizes.
  • Figure 4: Comparison of the performance gaps of FedSCS and PAX relative to the XGBoost-baseline.

Theorems & Definitions (9)

  • Lemma 1: Exact surrogate equivalence
  • proof
  • Lemma 2: Hessian prefix-mass perturbation
  • proof
  • Lemma 3: Lipschitz continuity of the gain
  • proof
  • Lemma 4: Uniform gain perturbation
  • Theorem 5: Finite-horizon $\varepsilon$-approximation
  • proof