Fairness in Machine Learning-based Hand Load Estimation: A Case Study on Load Carriage Tasks

Arafat Rahman, Sol Lim, Seokhyun Chung

TL;DR

Bias in ML-based hand-load estimation can arise from demographic differences in the training data, particularly sex, and leads to unfair predictions. The study compares conventional ML methods with a Debiasing VAE (DVAE) that disentangles sex-agnostic and sex-specific gait features, enabling fair predictions across sexes even with imbalanced data. DVAE consistently achieves lower MAE and better fairness metrics (SP, PRD, NRD) than both a standard VAE and conventional models. This work highlights fairness-aware modeling as essential for equitable ergonomic risk assessments and safer workplace interventions.

Abstract

Predicting external hand load from sensor data is essential for ergonomic exposure assessments, as obtaining this information typically requires direct observation or supplementary data. While machine learning methods have been used to estimate external hand load from worker postures or force exertion data, our findings reveal systematic bias in these predictions due to individual differences such as age and biological sex. To explore this issue, we examined bias in hand load prediction by varying the sex ratio in the training dataset. We found substantial sex disparity in predictive performance, especially when the training dataset is more sex-imbalanced. To address this bias, we developed and evaluated a fair predictive model for hand load estimation that leverages a Variational Autoencoder (VAE) with feature disentanglement. This approach is designed to separate sex-agnostic and sex-specific latent features, minimizing feature overlap. The disentanglement capability enables the model to make predictions based solely on sex-agnostic features of motion patterns, ensuring fair prediction for both biological sexes. Our proposed fair algorithm outperformed conventional machine learning methods (e.g., Random Forests) in both fairness and predictive accuracy, achieving a lower mean absolute error (MAE) difference across male and female sets and improved fairness metrics such as statistical parity (SP) and positive and negative residual differences (PRD and NRD), even when trained on imbalanced sex datasets. These findings emphasize the importance of fairness-aware machine learning algorithms to prevent potential disadvantages in workplace health and safety for certain worker populations.
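The group-wise comparison described in the abstract can be illustrated with a small sketch. The paper's exact definitions of SP, PRD, and NRD are not given in this section, so the code below assumes one common regression formulation: SP as the difference in mean predictions between groups, and PRD and NRD as the between-group differences in mean under-prediction and mean over-prediction. The function name `group_fairness_report` and the toy data are hypothetical.

```python
import numpy as np

def group_fairness_report(y_true, y_pred, is_male):
    """Male-minus-female gaps in MAE and three fairness quantities.

    Assumed definitions (not taken from the paper):
      sp_gap : difference in mean predicted load
      prd    : difference in mean positive residual (under-prediction)
      nrd    : difference in mean negative residual magnitude (over-prediction)
    A value of 0 means the two groups are treated alike on that quantity.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    is_male = np.asarray(is_male, dtype=bool)
    resid = y_true - y_pred

    def mean_or_zero(values):
        # Avoid NaN when a group has no residuals of a given sign.
        return float(values.mean()) if values.size else 0.0

    def stats(mask):
        r = resid[mask]
        return {
            "mae": mean_or_zero(np.abs(r)),
            "mean_pred": mean_or_zero(y_pred[mask]),
            "pos_resid": mean_or_zero(r[r > 0]),
            "neg_resid": mean_or_zero(-r[r < 0]),
        }

    male, female = stats(is_male), stats(~is_male)
    return {
        "mae_gap": male["mae"] - female["mae"],
        "sp_gap": male["mean_pred"] - female["mean_pred"],
        "prd": male["pos_resid"] - female["pos_resid"],
        "nrd": male["neg_resid"] - female["neg_resid"],
    }

# Toy example: made-up predictions for the three box weights (kg).
y_true = [4.5, 13.6, 22.7, 4.5, 13.6, 22.7]
y_pred = [5.5, 13.6, 21.7, 6.5, 11.6, 22.7]
is_male = [True, True, True, False, False, False]
report = group_fairness_report(y_true, y_pred, is_male)
print(report)
```

In this toy data both groups receive the same mean prediction, so the SP gap is zero, yet the female group has twice the error magnitude. This is why an error-based gap is reported alongside SP.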

Paper Structure

This paper contains 27 sections, 6 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Overview of data structure: Gait pattern data collected from 22 participants across three different box weights (4.5, 13.6, and 22.7 kg), labeled by biological sex and box weight.
  • Figure 2: Structure of a Variational Autoencoder (VAE) and its extension for supervised learning. The latent distribution is often modeled as a Gaussian distribution with mean $\mu$ and standard deviation $\sigma$.
  • Figure 3: An overview of the DVAE model that separates the sex-specific and sex-agnostic features. During inference, $\sigma$ (standard deviation) is not used because the model directly utilizes $\mu$ (mean) for deterministic and point predictions, avoiding stochastic sampling.
  • Figure 4: Performance of conventional ML models and the new approaches across varying male-to-female training ratios, evaluated on male and female test sets. Between-group performance differences are larger for the conventional ML models and the VAE than for the DVAE. The symbol “*” indicates a significant pairwise difference ($p < 0.05$).
  • Figure 5: Fairness metrics evaluated on both female and male test sets across different male-to-female training ratios. Values closer to the dotted red line indicate greater fairness based on the selected fairness metric. SP = Statistical Parity, PRD = Positive Residual Differences, NRD = Negative Residual Differences.
  • ...and 1 more figure
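
The Figure 3 caption notes that inference uses $\mu$ directly instead of sampling. A minimal sketch of that idea follows, assuming a toy linear encoder with random weights in place of the trained networks. The latent sizes, the weight names (`W_mu`, `W_logvar`, `w_load`), and the choice to store the sex-agnostic dimensions first are all hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 gait features, 3 sex-agnostic + 2 sex-specific latents.
N_FEATURES, N_AGNOSTIC, N_SPECIFIC = 8, 3, 2
N_LATENT = N_AGNOSTIC + N_SPECIFIC

# Random placeholders standing in for trained encoder and regressor weights.
W_mu = rng.normal(size=(N_FEATURES, N_LATENT))
W_logvar = 0.1 * rng.normal(size=(N_FEATURES, N_LATENT))
w_load = rng.normal(size=N_AGNOSTIC)

def encode(x):
    """Return the mean and log-variance of the Gaussian latent distribution."""
    return x @ W_mu, x @ W_logvar

def latent(x, training):
    mu, log_var = encode(x)
    if training:
        # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, I).
        sigma = np.exp(0.5 * log_var)
        return mu + sigma * rng.standard_normal(mu.shape)
    # Inference: use the mean directly, so predictions are deterministic.
    return mu

def predict_load(x, training=False):
    z = latent(x, training)
    z_agnostic = z[:, :N_AGNOSTIC]  # the sex-specific part is ignored here
    return z_agnostic @ w_load

x = np.ones((4, N_FEATURES))
print(predict_load(x))                 # identical on every call
print(predict_load(x, training=True))  # varies with the sampled noise
```

Because the load regressor reads only the sex-agnostic slice of the latent vector, any information confined to the sex-specific slice cannot influence the prediction. How well that confinement holds depends on the disentanglement achieved in training, which this sketch does not model.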