Task-Aware Delegation Cues for LLM Agents

Xingrui Gu

TL;DR

This framework reframes delegation from an opaque system default into a visible, negotiable, and auditable collaborative decision, providing a principled design space for adaptive human--agent collaboration grounded in mutual awareness and shared accountability.

Abstract

LLM agents increasingly present as conversational collaborators, yet human--agent teamwork remains brittle due to information asymmetry: users lack task-specific reliability cues, and agents rarely surface calibrated uncertainty or rationale. We propose a task-aware collaboration signaling layer that turns offline preference evaluations into online, user-facing primitives for delegation. Using Chatbot Arena pairwise comparisons, we induce an interpretable task taxonomy via semantic clustering, then derive (i) Capability Profiles as task-conditioned win-rate maps and (ii) Coordination-Risk Cues as task-conditioned disagreement (tie-rate) priors. These signals drive a closed-loop delegation protocol that supports common-ground verification, adaptive routing (primary vs.\ primary+auditor), explicit rationale disclosure, and privacy-preserving accountability logs. Two predictive probes validate that task typing carries actionable structure: cluster features improve winner prediction accuracy and reduce difficulty prediction error under stratified 5-fold cross-validation. Overall, our framework reframes delegation from an opaque system default into a visible, negotiable, and auditable collaborative decision, providing a principled design space for adaptive human--agent collaboration grounded in mutual awareness and shared accountability.
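
As a rough illustration of the two signals named in the abstract, the sketch below computes per-cluster win rates (a stand-in for Capability Profiles) and per-cluster tie rates (a stand-in for Coordination-Risk Cues) from pairwise comparisons, then uses them for a primary vs. primary+auditor routing decision. The record layout, the outcome labels `"a"`, `"b"`, `"tie"`, the choice to exclude ties from win rates, and the `risk_threshold` value are all assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict


def build_profiles(battles):
    """Aggregate pairwise comparisons into per-cluster signals.

    battles: iterable of (cluster_id, model_a, model_b, outcome), where
    outcome is "a", "b", or "tie" (hypothetical labels for this sketch).
    """
    wins = defaultdict(int)     # (cluster, model) -> decisive wins
    games = defaultdict(int)    # (cluster, model) -> decisive comparisons
    ties = defaultdict(int)     # cluster -> tied comparisons
    totals = defaultdict(int)   # cluster -> all comparisons
    for cluster, model_a, model_b, outcome in battles:
        totals[cluster] += 1
        if outcome == "tie":
            ties[cluster] += 1
            continue  # assumption: ties do not count toward win rates
        games[(cluster, model_a)] += 1
        games[(cluster, model_b)] += 1
        winner = model_a if outcome == "a" else model_b
        wins[(cluster, winner)] += 1
    win_rate = {key: wins[key] / n for key, n in games.items()}
    tie_rate = {cluster: ties[cluster] / n for cluster, n in totals.items()}
    return win_rate, tie_rate


def route(cluster, win_rate, tie_rate, risk_threshold=0.3):
    """Pick the strongest model for a cluster and flag whether to add an auditor.

    risk_threshold is an illustrative value, not one reported in the paper.
    """
    candidates = {m: r for (c, m), r in win_rate.items() if c == cluster}
    if not candidates:
        return None, False
    primary = max(candidates, key=candidates.get)
    needs_auditor = tie_rate.get(cluster, 0.0) > risk_threshold
    return primary, needs_auditor
```

Calling `route("coding", win_rate, tie_rate)` returns a `(primary, needs_auditor)` pair; a high tie rate in the cluster switches the decision from primary-only to primary+auditor.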

Paper Structure (29 sections, 11 equations, 6 figures, 1 table, 3 algorithms)

Figures (6)

  • Figure 1: The Task-Aware Delegation & Awareness Loop. Our framework operationalizes (1) Intent Recognition via semantic clustering, (2) Dynamic Delegation based on capability profiles, (3) Awareness Cues for trust calibration, and (4) Accountability Logging for error recovery.
  • Figure 2: Uncertainty / Coordination Risk by Task Type. We report a task-type-level proxy for uncertainty (e.g., tie rate / disagreement-derived hardness) aggregated within each cluster. Higher values indicate stronger model disagreement and thus elevated coordination risk, motivating safeguards such as clarification and auditing (Sec. \ref{sec:protocol}).
  • Figure 3: Task space visualization. Low-dimensional projection of prompt embeddings, colored by cluster assignment ($K=30$), illustrating the induced task-typing structure used throughout Sec. \ref{sec:task_typing}.
  • Figure 4: Interpretable task labels via cluster keywords. Representative topic words for two clusters, used to assign human-readable labels and to support common-ground negotiation in task typing (Sec. \ref{sec:task_typing}).
  • Figure 5: Overall preference win rate across tasks. Aggregate win rates across all prompts. We treat this as a global baseline; our focus is the task-conditioned profiles in Fig. \ref{fig:capability_map}.
  • ...and 1 more figures