Wink: Recovering from Misbehaviors in Coding Agents

Rahul Nanda; Chandra Maddila; Smriti Jha; Euna Mehnaz Khan; Matteo Paltenghi; Satish Chandra

TL;DR

This paper tackles misbehavior in autonomous coding agents powered by large language models by proposing Wink, an asynchronous self-intervention system that detects and corrects misbehaviors at production scale. It introduces a production-grounded taxonomy (Specification Drift, Reasoning Problems, and Tool Call Failures) and shows that a lightweight observer can deliver targeted course-corrections that recover most trajectories with minimal latency impact. On over 10,000 real-world trajectories, Wink resolves about 90% of the misbehaviors that require a single intervention, and a live A/B test shows statistically significant reductions in tool call failures, tokens per session, and engineer interventions per session. The work offers practical insights into deploying resilient agentic systems at scale and outlines directions for more sophisticated, hierarchical intervention strategies.

Abstract

Autonomous coding agents, powered by large language models (LLMs), are increasingly being adopted in the software industry to automate complex engineering tasks. However, these agents are prone to a wide range of misbehaviors, such as deviating from the user's instructions, getting stuck in repetitive loops, or failing to use tools correctly. These failures disrupt the development workflow and often require resource-intensive manual intervention. In this paper, we present a system for automatically recovering from agentic misbehaviors at scale. We first introduce a taxonomy of misbehaviors grounded in an analysis of production traffic, identifying three primary categories: Specification Drift, Reasoning Problems, and Tool Call Failures, which we find occur in about 30% of all agent trajectories. To address these issues, we developed a lightweight, asynchronous self-intervention system named Wink. Wink observes agent trajectories and provides targeted course-correction guidance to nudge the agent back to a productive path. We evaluated our system on over 10,000 real-world agent trajectories and found that it successfully resolves 90% of the misbehaviors that require a single intervention. Furthermore, a live A/B test in our production environment demonstrated that our system leads to a statistically significant reduction in Tool Call Failures, Tokens per Session, and Engineer Interventions per Session. We present our experience designing and deploying this system, offering insights into the challenges of building resilient agentic systems at scale.

Paper Structure (29 sections, 5 equations, 5 figures, 6 tables)

Introduction
Misbehaviors in Software Engineering Agents
Common Misbehaviors in Our Setting
Specification Drift (SD)
Reasoning Problems
Tool Call Failures
Methodology for Misbehavior Prevalence Calculation
Classifiers for prevalence tracking
Offline dataset
Misbehavior prevalence
Self intervention
Reflection and Guidance Generation
Course Correction
System architecture
Experiment setup
...and 14 more sections

Figures (5)

Figure 1: Self-intervention mechanism addressing specification drift. The agent deviates from user instructions, the intervention agent detects this drift, redirects the agent to follow the explicit instructions, and the agent acknowledges and executes the correct tool invocation, successfully completing the task.
Figure 2: Overview of the misbehavior prevalence metrics computation: Taxonomy creation and trajectory classification
Figure 3: The Self-Intervention system architecture. The observer runs asynchronously to prevent latency regressions in the main SWE agent loop, injecting guidance via system-reminders only when results are available.
Figure 4: Self-intervention mechanism addressing an agent's infinite-loop misbehavior. The diagram illustrates recovery (A) and non-recovery (B) trajectories following an intervention. Recovery occurs if the specific misbehavior (e.g., redundant calls to read_file(php_syntax.md)) does not recur. Conversely, non-recovery is defined by the recurrence of the same pattern of actions that constitutes the misbehavior. Different instances of agent misbehavior (C) can arise later and trigger new, separate interventions.
Figure 5: Self-intervention mechanism showing a tool call failure followed by intervention and recovery.
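
The recovery criterion in the Figure 4 caption (the flagged pattern of actions must not recur after the intervention) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `ToolCall` tuple shape, the `recovered` function, and the exact-match comparison are hypothetical, and the paper's system may decide recurrence differently (for example, with a classifier rather than literal matching).

```python
from typing import Sequence, Tuple

# Hypothetical representation of one agent action: (tool name, argument).
ToolCall = Tuple[str, str]


def recovered(
    trajectory: Sequence[ToolCall],
    intervention_index: int,
    misbehavior_pattern: Sequence[ToolCall],
) -> bool:
    """Return True if the flagged pattern never recurs after the intervention.

    `intervention_index` is the position of the first action taken after the
    guidance was injected. Recurrence is approximated here by an exact,
    contiguous match of `misbehavior_pattern` (an illustrative assumption).
    """
    after = list(trajectory[intervention_index:])
    pattern = list(misbehavior_pattern)
    n = len(pattern)
    if n == 0:
        return True  # nothing was flagged, so nothing can recur
    for start in range(len(after) - n + 1):
        if after[start:start + n] == pattern:
            return False  # same pattern seen again: non-recovery (case B)
    return True  # pattern absent after the intervention: recovery (case A)


# Redundant reads of the same file, as in the caption's example.
read = ("read_file", "php_syntax.md")
loop_pattern = [read, read]

# Case A: the agent moves on after the intervention at index 3.
trajectory_a = [read, read, read, ("edit_file", "index.php"), ("run_tests", "")]
# Case B: the agent keeps re-reading the same file.
trajectory_b = [read, read, read, read, read]

print(recovered(trajectory_a, 3, loop_pattern))
print(recovered(trajectory_b, 3, loop_pattern))
```

A different misbehavior appearing later (case C in the caption) would not count against recovery under this sketch, because only the originally flagged pattern is checked; it would instead be handled as a new, separate intervention.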
