Evaluating Fairness in Self-supervised and Supervised Models for Sequential Data
Sofia Yfantidou, Dimitris Spathis, Marios Constantinides, Athena Vakali, Daniele Quercia, Fahim Kawsar
TL;DR
The paper addresses fairness gaps in time-series healthcare modeling by comparing self-supervised learning (SSL) with supervised learning across multiple models and fine-tuning strategies on the MIMIC-III dataset. It uses a SimCLR-style SSL framework, progressive layer freezing during fine-tuning, and centered kernel alignment (CKA) to compare learned representations, and it evaluates models with AUC-ROC for performance and the Error Rate Ratio (ERR) for fairness. The key finding is that SSL can match supervised performance while delivering up to a 27% improvement in fairness at about a 1% loss in performance, depending on the fine-tuning strategy; differences in the learned representations across demographic groups contribute to these effects. The work highlights SSL's potential for fairer, data-scarce, human-centric applications such as critical care monitoring and mortality prediction.
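As a rough illustration of how CKA compares two sets of learned representations, the sketch below computes linear CKA between two activation matrices with NumPy. It makes several assumptions: the linear-kernel variant is assumed (the summary does not say which kernel the paper uses), the function name `linear_cka` is hypothetical, and the inputs are random stand-ins for layer activations rather than real MIMIC-III features.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two activation matrices.

    x, y: arrays of shape (n_samples, n_features); the feature dimensions
    may differ, but both must describe the same n_samples inputs.
    Assumption: linear kernel, which may not match the paper's setup.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must have the same number of samples")

    # Center each feature column so the similarity ignores mean offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # Squared Frobenius norm of the cross-covariance, normalized by the
    # Frobenius norms of each representation's own covariance.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)


# Hypothetical stand-ins for one layer's activations in two models,
# evaluated on the same 200 input samples.
rng = np.random.default_rng(0)
layer_ssl = rng.normal(size=(200, 64))
layer_supervised = rng.normal(size=(200, 32))
similarity = linear_cka(layer_ssl, layer_supervised)
```

Linear CKA equals 1 for identical representations and is unchanged by rotating or uniformly rescaling the features. To compare demographic groups, one option is to compute the score separately on each group's samples and contrast the results.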
Abstract
Self-supervised learning (SSL) has become the de facto training paradigm for large models, in which pre-training is followed by supervised fine-tuning on domain-specific data and labels. Hypothesizing that SSL models learn more generic, and hence less biased, representations, this study explores the impact of pre-training and fine-tuning strategies on fairness (i.e., performing equally well across demographic groups). Motivated by human-centric applications on real-world time-series data, we interpret inductive biases at the model, layer, and metric levels by systematically comparing SSL models to their supervised counterparts. Our findings demonstrate that SSL can achieve performance on par with supervised methods while significantly enhancing fairness, exhibiting up to a 27% increase in fairness with a mere 1% loss in performance. Ultimately, this work underscores SSL's potential in human-centric computing, particularly in high-stakes, data-scarce application domains such as healthcare.
