Recourse for reclamation: Chatting with generative language models

Jennifer Chien, Kevin R. McKee, Jackie Kay, William Isaac

TL;DR

This paper examines how static toxicity scoring can suppress information and hinder language reclamation in generative language models (GLMs). It proposes a dynamic recourse mechanism that lets users set toxicity tolerances via two thresholds, $h^\ast$ and $h^{\mathrm{max}}$, thereby granting real-time user agency over model outputs. A pilot within-subject study ($n=27$ after excluding inattentive participants from an initial $n=30$) compares fixed-threshold filtering with dynamic recourse. System Usability Scale scores were higher in the recourse condition, and a majority of participants chose to adjust future filtering, though participants agreed less with controllability statements in the recourse condition than in the control condition. Qualitative analyses identify themes around understanding toxicity scoring, perceived control, and biases in toxicity detection, highlighting both the potential for interactive alignment and the need for broader evaluation across diverse user groups. Overall, the work suggests that incorporating user-driven recourse into GLM interactions can empower users and improve usability, while underscoring limitations related to sample representativeness and the complexity of real-time alignment.

Abstract

Researchers and developers increasingly rely on toxicity scoring to moderate generative language model outputs, in settings such as customer service, information retrieval, and content generation. However, toxicity scoring may render pertinent information inaccessible, rigidify or "value-lock" cultural norms, and prevent language reclamation processes, particularly for marginalized people. In this work, we extend the concept of algorithmic recourse to generative language models: we provide users a novel mechanism to achieve their desired prediction by dynamically setting thresholds for toxicity filtering. Users thereby exercise increased agency relative to interactions with the baseline system. A pilot study ($n = 30$) supports the potential of our proposed recourse mechanism, indicating improvements in usability compared to fixed-threshold toxicity-filtering of model outputs. Future work should explore the intersection of toxicity scoring, model controllability, user agency, and language reclamation processes -- particularly with regard to the bias that many communities encounter when interacting with generative language models.

Paper Structure (34 sections, 4 figures, 4 tables, 2 algorithms)

Figures (4)

  • Figure 1: Example distribution for toxicity scoring. $h^\ast$ denotes the brand risk threshold for toxicity scoring. All content $c_i$ with a toxicity score $H(c_i) < h^\ast$ is shown to the user, whereas content with a score $H(c_i) \geq h^\ast$ is filtered out: users receive a default message instead. Content where $h^\ast < H(c_i)\leq h^{\mathrm{max}}$ may be permissible, but by default is filtered out. Our proposed recourse mechanism affords user agency over this region of content, resulting in small regions of permissible content between the $h^\ast$ and $h^{\mathrm{max}}$ thresholds.
  • Figure 2: Variation in system usability by condition. GLM interactions in the recourse condition produced higher usability ratings. For this visualization, we classify System Usability Scale scores into categories per Brooke (1996).
  • Figure 3: Additional variation in GLM usability by condition. Algorithmic recourse showed promise in terms of improving GLM usability. Pilot participants agreed more with these usability statements in the recourse condition than in the control condition. Higher numbers indicated stronger agreement.
  • Figure 4: Variation in perceived controllability by condition. Participants expressed difficulty attempting to modify GLM outputs, suggesting the need for further research on recourse mechanisms and user agency in interactions with GLMs. Pilot participants agreed less with these controllability statements in the recourse condition than in the control condition. Higher numbers indicated stronger agreement.
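
The two-threshold rule in the Figure 1 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `classify` and `respond`, the `user_allows` flag, the `DEFAULT_MESSAGE` text, and the example threshold values are all hypothetical, and the single opt-in flag is a simplifying assumption about how a user's choice might be recorded. The inequalities follow the caption literally, so a score exactly equal to $h^\ast$ is filtered and falls outside the recourse region.

```python
# Hypothetical placeholder for the default message users receive
# when content is filtered out.
DEFAULT_MESSAGE = "[This response was filtered.]"


def classify(score: float, h_star: float, h_max: float) -> str:
    """Map a toxicity score H(c_i) to a region, following the Figure 1 caption."""
    if h_star > h_max:
        raise ValueError("expected h_star <= h_max")
    if score < h_star:
        return "show"      # H(c_i) < h*: shown to the user
    if h_star < score <= h_max:
        return "recourse"  # h* < H(c_i) <= h^max: filtered by default, user may opt in
    return "filter"        # H(c_i) == h* or H(c_i) > h^max: filtered


def respond(content: str, score: float, h_star: float, h_max: float,
            user_allows: bool = False) -> str:
    """Return the content or the default message.

    `user_allows` is an assumed stand-in for the user's recourse choice.
    """
    region = classify(score, h_star, h_max)
    if region == "show" or (region == "recourse" and user_allows):
        return content
    return DEFAULT_MESSAGE


if __name__ == "__main__":
    # Illustrative thresholds only; the source does not state numeric values here.
    for s in (0.2, 0.6, 0.9):
        print(s, classify(s, h_star=0.5, h_max=0.8))
```

In this sketch the user's choice only affects content in the region between the two thresholds; content scoring above $h^{\mathrm{max}}$ stays filtered regardless of the flag.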