Farsight: Fostering Responsible AI Awareness During AI Application Prototyping

Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio

TL;DR

Farsight is an in situ harm-envisioning tool embedded in AI prototyping environments that helps prototypers from diverse backgrounds anticipate downstream harms early. Based on a user's prompt, it surfaces relevant AI incident reports and LLM-generated use cases, stakeholders, and harms, which users can edit and extend in an interactive Harm Envisioner; the implementation is open-source and agnostic to model and environment. A mixed-methods evaluation with 42 participants shows that Farsight helps users independently envision more harms, shifts their attention toward end users and cascading harms, and is rated more useful and usable than existing resources. The work also discusses the value and challenges of in situ harm awareness and the subjectivity of harm assessment, and proposes integrating mitigation guidance and seamful design into future AI prototyping tools.

Abstract

Prompt-based interfaces for Large Language Models (LLMs) have made prototyping and building AI-powered applications easier than ever before. However, identifying potential harms that may arise from AI applications remains a challenge, particularly during prompt-based prototyping. To address this, we present Farsight, a novel in situ interactive tool that helps people identify potential harms from the AI applications they are prototyping. Based on a user's prompt, Farsight highlights news articles about relevant AI incidents and allows users to explore and edit LLM-generated use cases, stakeholders, and harms. We report design insights from a co-design study with 10 AI prototypers and findings from a user study with 42 AI prototypers. After using Farsight, AI prototypers in our user study are better able to independently identify potential harms associated with a prompt and find our tool more useful and usable than existing resources. Their qualitative feedback also highlights that Farsight encourages them to focus on end-users and think beyond immediate harms. We discuss these findings and reflect on their implications for designing AI prototyping experiences that meaningfully engage with AI harms. Farsight is publicly accessible at: https://PAIR-code.github.io/farsight.

Paper Structure (59 sections, 18 figures, 3 tables)

Figures (18)

  • Figure 1: (A) Many AI prototypers from diverse backgrounds and roles use (B) prompting tools to prototype AI applications. Farsight provides a range of in situ widgets for these tools, helping AI prototypers envision the potential harms of their AI applications during an early prototyping stage.
  • Figure 2: Farsight fits into AI prototypers' diverse prompting workflows, including prompting GUIs and computational notebooks. For example, (A) when an AI prototyper writes prompts for a therapy chatbot in Google AI Studio [googleGoogleAiStudio2023], Farsight's Chrome extension alerts the user about related accidents and potential harms. (B) When an AI prototyper writes prompts for a toxicity classifier in Jupyter Notebook [kluyverJupyterNotebooksaPublishing2016, wangStickyLandBreakingLinear2022a], Farsight's Python library shows potential negative consequences of this classifier.
  • Figure 3: Average ratings of our design ideas from 10 AI prototypers. Marked features were presented to participants as early-stage prototypes, while the other features were presented as sketches (see the appendix figure on prototypes for details).
  • Figure 4: Three alert modes of the Alert Symbol.
  • Figure 5: The Awareness Sidebar provides in situ information to remind AI prototypers of potential risks. (A) Given a user's current prompt, (B) the Incident Panel shows the (B1) latest and (B2) related AI incident reports sampled from the AI Incident Database [mcgregorPreventingRepeatedReal2020]. The related-incidents tab (B2) is the default view; it uses text-embedding similarity between the user's prompt and all AI incident reports to surface relevant reports. (C) The Use Case Panel leverages an LLM to generate potential use cases and harms. Each use case is classified by an LLM and organized into (C1) intended, (C2) high-stakes, and (C3) misuse tabs.
  • ...and 13 more figures
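
The Figure 5 caption describes surfacing related incident reports by comparing text embeddings of the user's prompt against all reports. The sketch below illustrates that general idea only; it is not Farsight's implementation. The `embed` function is a toy bag-of-words stand-in for a real embedding model, the function names are hypothetical, and the example reports are invented placeholders rather than entries from the AI Incident Database.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a text embedding: a bag-of-words count vector.

    ASSUMPTION: a real system would call an embedding model here instead.
    """
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_incidents(prompt: str, reports: list[str], top_k: int = 3):
    """Return the top_k (score, report) pairs, most similar first."""
    prompt_vec = embed(prompt)
    scored = [(cosine_similarity(prompt_vec, embed(r)), r) for r in reports]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Invented placeholder reports, for illustration only.
reports = [
    "chatbot gave harmful advice to a user seeking therapy",
    "self-driving car failed to detect a pedestrian",
    "resume screening model ranked candidates unfairly",
]
prompt = "You are a therapy chatbot that gives supportive advice"
most_related = rank_incidents(prompt, reports, top_k=1)
```

With these placeholder inputs, the therapy-related report ranks first because it shares the most tokens with the prompt. Swapping `embed` for a dense embedding model would leave the ranking logic unchanged.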