Illuminating the Unseen: Investigating the Context-induced Harms in Behavioral Sensing

Han Zhang, Vedant Das Swain, Leijie Wang, Nan Gao, Yilun Sheng, Xuhai Xu, Flora D. Salim, Koustuv Saha, Anind K. Dey, Jennifer Mankoff

TL;DR

The paper identifies three gaps in research on harms from behavioral sensing: limited attention to context-specific identities, neglect of situation-based harms across deployment settings, and insufficient frameworks for iterative bias mitigation and maintenance. It introduces a theory-driven, human-centered evaluation framework that integrates context sensitivity and user engagement with ongoing maintenance. Through two post-hoc studies (depression detection and student engagement prediction), the authors demonstrate both identity- and situation-based harms in real-world data and show that bias-mitigation efforts can shift harms rather than eliminate them. They argue for regular maintenance, broader stakeholder involvement, and expansion of responsible-use considerations (transparency, privacy, accountability) to enable safer, more equitable behavioral sensing in dynamic, longitudinal deployments.

Abstract

Behavioral sensing technologies are rapidly evolving across a range of well-being applications. Despite their potential, concerns about the responsible use of such technologies are escalating. In response, recent research on sensing technology has started to address these issues. While promising, these efforts primarily focus on broad demographic categories and overlook more nuanced, context-specific identities. They also lack grounding in the domain-specific harms that arise from deploying sensing technology in diverse social, environmental, and technological settings. Additionally, existing frameworks for evaluating harms are designed for a generic ML life cycle and fail to adapt to the dynamic and longitudinal considerations of behavioral sensing technology. To address these gaps, we introduce a framework specifically designed for evaluating behavioral sensing technologies. This framework emphasizes a comprehensive understanding of context, particularly the situated identities of users and the deployment settings of the sensing technology. It also highlights the necessity of iterative harm mitigation and continuous maintenance to adapt to the evolving nature of technology and its use. We demonstrate the feasibility and generalizability of our framework through post-hoc evaluations of two real-world behavioral sensing studies conducted in different international contexts, involving varied population demographics and machine learning tasks. Our evaluations provide empirical evidence of both situated identity-based harms and more domain-specific harms, and we discuss the trade-offs introduced by implementing bias mitigation techniques.

Paper Structure

This paper contains 43 sections, 4 figures, 11 tables.

Figures (4)

  • Figure 1: Overview of the Framework for Evaluating and Mitigating Context-induced Harms in Behavioral Sensing. Steps 3, 4, and 5 cover the conventional evaluation flow for identifying harms.
  • Figure 2: Percentage of each group within each sensitive attribute. The protected group for each sensitive attribute (e.g., first-gen) is shaded in dark colors, while the unprotected group (e.g., non-first-gen) is shaded in light colors. Non-male includes women, transgender individuals, and genderqueer individuals; non-heterosexual includes homosexual, bisexual, and asexual individuals; and non-white includes Black, Asian, Latinx, and biracial individuals.
  • Figure 3: Example of fairness evaluation based on the disparities in accuracy, false negative rate, and false positive rate. (a) shows the synthetic data for 20 individuals, with 6 belonging to the protected group (represented by "x" marks) and 14 belonging to the unprotected group (represented by "$\boldsymbol{\cdot}$" marks). (b) visualizes the distribution and disparities of predictions for both groups, where correct predictions are depicted in green and incorrect predictions in red.
  • Figure 4: Comparisons of depression (BDI-II) scores across groups in four datasets. The red dotted line indicates the cutoff point (i.e., 13) for BDI-II scores, which is used to distinguish between students with at least mild depressive symptoms (BDI-II >= 13) and those without (BDI-II < 13). Significance levels after Benjamini-Hochberg (B-H) correction are marked with a red asterisk (*$p<0.05$) on each subplot. First-gen, BA, and HET represent first-generation college students, bachelor's degree, and heterosexual, respectively.
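
The fairness evaluation described in the Figure 3 caption compares accuracy, false negative rate, and false positive rate between a protected and an unprotected group. The following is a minimal sketch of that kind of comparison, not the authors' implementation. It assumes binary labels and predictions, and it defines each disparity as the protected-group rate minus the unprotected-group rate (the paper may use a different convention). The function names `group_rates` and `disparities` are hypothetical, and the data are made up for illustration: only the group sizes (6 protected, 14 unprotected) follow the caption.

```python
from dataclasses import dataclass


@dataclass
class GroupRates:
    accuracy: float
    fnr: float  # false negative rate: FN / (FN + TP)
    fpr: float  # false positive rate: FP / (FP + TN)


def group_rates(y_true, y_pred):
    """Compute accuracy, FNR, and FPR for one group (binary 0/1 labels)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn
    nan = float("nan")
    return GroupRates(
        accuracy=(tp + tn) / total if total else nan,
        fnr=fn / (fn + tp) if (fn + tp) else nan,
        fpr=fp / (fp + tn) if (fp + tn) else nan,
    )


def disparities(y_true, y_pred, protected):
    """Return protected-minus-unprotected differences for each metric.

    `protected` is a list of booleans marking protected-group members.
    The sign convention is an assumption made for this sketch.
    """
    prot = [(t, p) for t, p, g in zip(y_true, y_pred, protected) if g]
    unprot = [(t, p) for t, p, g in zip(y_true, y_pred, protected) if not g]
    r_prot = group_rates(*zip(*prot))
    r_unprot = group_rates(*zip(*unprot))
    return {
        "accuracy": r_prot.accuracy - r_unprot.accuracy,
        "fnr": r_prot.fnr - r_unprot.fnr,
        "fpr": r_prot.fpr - r_unprot.fpr,
    }


# Hypothetical data: 6 protected individuals followed by 14 unprotected ones.
protected = [True] * 6 + [False] * 14
y_true = [1, 1, 1, 0, 0, 0] + [1] * 6 + [0] * 8
y_pred = [1, 0, 0, 0, 1, 0] + [1, 1, 1, 1, 1, 0] + [0, 0, 0, 0, 0, 0, 0, 1]

if __name__ == "__main__":
    for metric, gap in disparities(y_true, y_pred, protected).items():
        print(f"{metric} disparity: {gap:+.3f}")
```

In this made-up example the protected group has lower accuracy and higher error rates of both kinds. A positive false negative rate disparity would mean that, in a task such as depression detection, members of the protected group are missed more often.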