Behavior Matters: An Alternative Perspective on Promoting Responsible Data Science

Ziwei Dong, Ameya Patil, Yuichi Shoda, Leilani Battle, Emily Wall

TL;DR

This vision paper presents example data science interventions in machine learning and visual data analysis, contextualized in behavior change theories that could be implemented to interrupt and redirect potentially suboptimal or negligent practices while reinforcing ethically conscious behaviors.

Abstract

Data science pipelines inform and influence many daily decisions, from what we buy to who we work for and even where we live. When designed incorrectly, these pipelines can easily propagate social inequity and harm. Traditional solutions are technical in nature; e.g., mitigating biased algorithms. In this vision paper, we introduce a novel lens for promoting responsible data science using theories of behavior change that emphasize not only technical solutions but also the behavioral responsibility of practitioners. By integrating behavior change theories from cognitive psychology with data science workflow knowledge and ethics guidelines, we present a new perspective on responsible data science. We present example data science interventions in machine learning and visual data analysis, contextualized in behavior change theories that could be implemented to interrupt and redirect potentially suboptimal or negligent practices while reinforcing ethically conscious behaviors. We conclude with a call to action to our community to explore this new research area of behavior change interventions for responsible data science.

Paper Structure

This paper contains 23 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: We characterize data science practices according to desired outcomes (rows -- satisfactory and responsible) and agents (columns -- technical and human). It is important to note that outcomes are not mutually exclusive. Rigorous data science has historically emphasized technical aspects like auto-tuning and measures of model accuracy (A, green cell). Recent efforts towards model fairness have illustrated responsible data science, but still ultimately rely on technical indicators and algorithmic solutions (B). In this paper, we emphasize the agency of humans (C and D, right-hand column), and in particular, how human behaviors can contribute to responsible data science (D, red cell).
  • Figure 2: Drawing analogies from behavior change solutions in the clinical domain (green) to the data science domain (blue). Each column represents a behavior change domain. The rows characterize the behavior change problem and solutions, starting with the domain context. The next row characterizes exemplary theories of behavior change, followed by Agents and Desired Outcomes, and how together these might inform a specific intervention in each domain context (final row). The agents and outcomes, characterized as technically satisfactory or behaviorally responsible, are described further in Figure 1 and in the section on outcomes and agents. We hand-pick these limited examples for the sake of space and to demonstrate how behavior change theory can be applied across different domains to bring about the desired outcome through the agent in a generalizable way.
  • Figure 3: As data scientists start analyzing the loan approval dataset within a Jupyter notebook, this intervention (a) reinforces their motivation to practice responsible data science by sharing a real-life story that highlights the potential harm that model outcomes can inflict on disadvantaged groups, aiming to evoke their empathy; and (b) follows up with a goal-priming hint that emphasizes the importance of behaving in an unbiased way towards vulnerable sub-groups affected by the model’s outcome.
  • Figure 4: A data visualization showing which nations are major CO2 emitters, and which nations are vulnerable to the effects of these emissions. In its current state, this visualization might only help global policymakers like the Intergovernmental Panel on Climate Change (IPCC). By gathering feedback from viewer groups of different backgrounds like politicians, farmers, and students, this visualization could be made more effective by additionally visualizing how each group contributes to these emissions and how they could help alleviate the problem. Credits: https://onlinepublichealth.gwu.edu/resources/climate-change-emissions-data/