Exploiting Boosting in Hyperdimensional Computing for Enhanced Reliability in Healthcare

SungHeon Jeong, Hamza Errahmouni Barkam, Sanggeon Yun, Yeseong Kim, Shaahin Angizi, Mohsen Imani

TL;DR

The paper addresses reliability and robustness in healthcare analytics using hyperdimensional computing (HDC), where underutilization of the high-dimensional space can lead to overfitting in data-constrained settings. It introduces BoostHD, which partitions the total dimension $D$ into $n$ subspaces and trains a sequence of weak learners that are boosted into a strong ensemble, building on OnlineHD. Empirical results on healthcare datasets, including the WESAD benchmark, show BoostHD achieving up to $98.37\%$ accuracy and maintaining stability under noise and data imbalance, with person-specific average accuracy around $96.19\%$. This work demonstrates that efficient subspace utilization and boosting can extend HDC's reliability and practicality for critical applications such as wearable healthcare analytics.

Abstract

Hyperdimensional computing (HDC) enables efficient data encoding and processing in high-dimensional space, benefiting machine learning and data analysis. However, underutilization of these spaces can lead to overfitting and reduced model reliability, especially in data-limited systems, a critical issue in sectors like healthcare that demand robustness and consistent performance. We introduce BoostHD, an approach that applies boosting algorithms to partition the hyperdimensional space into subspaces, creating an ensemble of weak learners. By integrating boosting with HDC, BoostHD enhances performance and reliability beyond existing HDC methods. Our analysis highlights the importance of efficient utilization of hyperdimensional spaces for improved model performance. Experiments on healthcare datasets show that BoostHD outperforms state-of-the-art methods. On the WESAD dataset, it achieved an accuracy of 98.37%, surpassing Random Forest, XGBoost, and OnlineHD. BoostHD also demonstrated superior inference efficiency and stability, maintaining high accuracy under data imbalance and noise. In person-specific evaluations, it achieved an average accuracy of 96.19%, outperforming other models. By addressing the limitations of both boosting and HDC, BoostHD expands the applicability of HDC in critical domains where reliability and precision are paramount.
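
The partition-and-boost idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine random-projection encoder, the weighted-bundling weak learner `WeakHD`, the SAMME-style (multi-class AdaBoost) weight update, and every function and parameter name below are assumptions chosen for brevity. The only elements taken from the text are that the total dimension $D$ is split into $n$ subspaces of size $D/n$ and that the resulting weak learners are combined by boosting.

```python
import numpy as np


class WeakHD:
    """Hypothetical weak HDC learner living in one D/n-dimensional subspace."""

    def __init__(self, n_features, dim, n_classes, rng):
        # Assumed encoder: random projection followed by a cosine nonlinearity.
        self.proj = rng.normal(size=(n_features, dim))
        self.bias = rng.uniform(0.0, 2.0 * np.pi, size=dim)
        self.n_classes = n_classes

    def encode(self, X):
        return np.cos(X @ self.proj + self.bias)

    def fit(self, X, y, w):
        H = self.encode(X)
        # Weighted bundling: each class hypervector is the weighted sum
        # of the encoded samples that belong to that class.
        self.C = np.stack([
            (w[y == c, None] * H[y == c]).sum(axis=0)
            for c in range(self.n_classes)
        ])
        return self

    def predict(self, X):
        H = self.encode(X)
        Hn = H / (np.linalg.norm(H, axis=1, keepdims=True) + 1e-12)
        Cn = self.C / (np.linalg.norm(self.C, axis=1, keepdims=True) + 1e-12)
        # Classify by highest cosine similarity to a class hypervector.
        return np.argmax(Hn @ Cn.T, axis=1)


def fit_boosted_hd(X, y, total_dim=1000, n_learners=10, seed=0):
    """Train n_learners weak learners, each of dimension total_dim // n_learners."""
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    sub_dim = total_dim // n_learners            # the D/n partition
    w = np.full(len(y), 1.0 / len(y))            # uniform sample weights
    learners, alphas = [], []
    for _ in range(n_learners):
        learner = WeakHD(X.shape[1], sub_dim, n_classes, rng).fit(X, y, w)
        miss = learner.predict(X) != y
        err = np.clip(w[miss].sum(), 1e-10, 1.0 - 1e-10)
        # SAMME learner importance (an assumption); floored at zero so a
        # worse-than-chance learner simply gets no vote.
        alpha = max(np.log((1.0 - err) / err) + np.log(n_classes - 1.0), 0.0)
        w = w * np.exp(alpha * miss)             # up-weight misclassified samples
        w /= w.sum()
        learners.append(learner)
        alphas.append(alpha)
    return learners, np.array(alphas), n_classes


def predict_boosted_hd(model, X):
    """Importance-weighted vote over all weak learners."""
    learners, alphas, n_classes = model
    votes = np.zeros((len(X), n_classes))
    rows = np.arange(len(X))
    for learner, alpha in zip(learners, alphas):
        votes[rows, learner.predict(X)] += alpha
    return votes.argmax(axis=1)
```

Calling `fit_boosted_hd(X, y, total_dim=1000, n_learners=10)` on a feature matrix `X` and integer labels `y` would train ten learners of 100 dimensions each, and `predict_boosted_hd(model, X_new)` returns the weighted-vote labels. In this sketch the total memory footprint stays at `total_dim` components per class regardless of how many learners share it.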

Paper Structure

This paper contains 13 sections, 5 equations, 8 figures, 3 tables, 1 algorithm.

Figures (8)

  • Figure 1: Illustration of the BoostHD framework applied to hyperdimensional computing (HDC). Sensor information is encoded into a high-dimensional vector space (Dimension $D$). The query vector $Q$ is bundled with multiple contextual vectors $C_1, C_2, \dots, C_n$, forming weak learners with segments of the high-dimensional space. Each weak learner receives a partitioned subspace ($D/n$) of the original hyperdimensional space, optimizing the use of the entire dimensional space and minimizing overfitting. Query weights $W_{Q1}, W_{Q2}, \dots, W_{Qn}$ and model importances are dynamically adjusted based on model error rates, with a boosting approach used to aggregate and adjust the ensemble performance, ensuring robustness and stability, particularly in noise-sensitive domains such as healthcare.
  • Figure 2: Extreme distribution of the terms (Eqs. \ref{T1}, \ref{T2}, \ref{T3}) in $\sigma_{\lambda}^2$
  • Figure 3: Accuracy heatmap as a function of $N_L$ and the corresponding $D$. In (a) and (b), $N_L$ ranges from 1 to 100 in steps of 1 and from 10 to 100 in steps of 10, respectively. In (a), the accuracy is shown for each specified dimension. In (b), the total dimension ($D_{\text{total}}$) is divided among the $N_L$ learners, so each learner has a dimension of $D_{\text{total}} / N_L$.
  • Figure 4: Kernel transformation process, in which data is mapped into a hyperdimensional space. (a) shows that the raw data has a biased distribution. (b) represents a scenario where $N_c = 4000$, while (c) corresponds to $N_c = 400$. In terms of span utilization, the mapping in (c) is more efficient than that in (b).
  • Figure 5: Example of the $SP$ of BoostHD and OnlineHD. Orange shows BoostHD's class hypervectors, and blue shows OnlineHD's class hypervector. BoostHD uses much more of the hyperdimensional space by composing high cosine similarity across class hypervectors.
  • ...and 3 more figures