Investigating Agency of LLMs in Human-AI Collaboration Tasks

Ashish Sharma, Sudha Rao, Chris Brockett, Akanksha Malhotra, Nebojsa Jojic, Bill Dolan

TL;DR

This work defines and operationalizes Agency for LLMs in human–AI collaboration using Bandura's social‑cognitive theory, decomposing Agency into Intentionality, Motivation, Self‑Efficacy, and Self‑Regulation. It builds a collaborative interior‑design testbed and a new human–human dataset of 83 conversations with 908 Agency‑annotated snippets to study how agentive dialogue affects outcomes. The paper presents two tasks, Task 1 (measuring Agency in dialogue) and Task 2 (generating agentive dialogue), and evaluates multiple LLMs and prompting/finetuning strategies on them. It reports that stronger Agency features correlate with higher perceived agency and task satisfaction, and that demonstrations of Agency can boost model performance. The work provides benchmarks, baselines, and methodological tools for creating controllable, agentive language models in creative collaboration, highlights ethical considerations and domain limitations, and offers data and code to advance this area of research.

Abstract

Agency, the capacity to proactively shape events, is central to how humans interact and collaborate. While LLMs are being developed to simulate human behavior and serve as human-like agents, little attention has been given to the Agency that these models should possess in order to proactively manage the direction of interaction and collaboration. In this paper, we investigate Agency as a desirable function of LLMs, and how it can be measured and managed. We build on social-cognitive theory to develop a framework of features through which Agency is expressed in dialogue - indicating what you intend to do (Intentionality), motivating your intentions (Motivation), having self-belief in intentions (Self-Efficacy), and being able to self-adjust (Self-Regulation). We collect a new dataset of 83 human-human collaborative interior design conversations containing 908 conversational snippets annotated for Agency features. Using this dataset, we develop methods for measuring Agency of LLMs. Automatic and human evaluations show that models that manifest features associated with high Intentionality, Motivation, Self-Efficacy, and Self-Regulation are more likely to be perceived as strongly agentive.

Paper Structure (35 sections, 10 figures, 6 tables)

Figures (10)

  • Figure 1: We investigate how Agency of LLMs can be measured and controlled. Based on social-cognitive theory, we assess features through which Agency may be expressed -- an LLM may indicate preferences (Intentionality), may motivate them with evidence (Motivation), may have self-belief (Self-Efficacy), and may be able to self-adjust its behavior (Self-Regulation).
  • Figure 2: Overview of our data collection approach. (a) We start by collecting human-human conversations between interior designers. (b) We divide each conversation into snippets related to different chair features. (c) Finally, we collect annotations of Agency and its features on each conversational snippet.
  • Figure 3: The relationship between Agency and its features. (a) Designers with High Agency expressed strong Intentionality 26.5% more often than designers with Low Agency; (b) Designers with High Agency expressed strong Motivation in support of their design preference 15.2% more often; (c), (d) Expression of strong Self-Efficacy and strong Self-Regulation was associated with the design elements that were influenced in collaboration.
  • Figure 4: The relationship between linguistic attributes and Agency. Designers who were more tentative had lower agency. On the other hand, designers who were more focused on self, expressed more reasoning strength, and were more persuasive had higher agency.
  • Figure 5: Human Evaluation Results.
  • ...and 5 more figures