In-Place Feedback: A New Paradigm for Guiding LLMs in Multi-Turn Reasoning

Youngbin Choi; Minjong Lee; Saemi Moon; Seunghyuk Cho; Chaehyeon Chung; MoonJeong Park; Dongwoo Kim

In-Place Feedback: A New Paradigm for Guiding LLMs in Multi-Turn Reasoning

Youngbin Choi, Minjong Lee, Saemi Moon, Seunghyuk Cho, Chaehyeon Chung, MoonJeong Park, Dongwoo Kim

TL;DR

This work introduces in-place feedback, a state-repair paradigm for guiding LLMs in multi-turn reasoning by allowing users to directly edit the model's previous output and continuing generation from the edited state. It identifies three failure modes of traditional multi-turn refinement and demonstrates that in-place feedback improves task accuracy while reducing token usage by about 79.1% across GPQA, MMLU-pro, and MATH-hard benchmarks. Through controlled ZebraLogic experiments, the authors show that in-place edits better preserve correct reasoning, sustain feedback incorporation over turns, and limit error propagation. The results suggest that in-place feedback is a more natural, efficient, and scalable mechanism for guiding LLMs in reasoning-intensive tasks with wide potential applicability.

Abstract

Large language models (LLMs) are increasingly studied in the context of multi-turn reasoning, where models iteratively refine their outputs based on user-provided feedback. Such settings are crucial for tasks that require complex reasoning, yet existing feedback paradigms often rely on issuing new messages. LLMs struggle to integrate these reliably, leading to inconsistent improvements. In this work, we introduce in-place feedback, a novel interaction paradigm in which users directly edit an LLM's previous response, and the model conditions on this modified response to generate its revision. Empirical evaluations on diverse reasoning-intensive benchmarks reveal that in-place feedback achieves better performance than conventional multi-turn feedback while using $79.1\%$ fewer tokens. Complementary analyses on controlled environments further demonstrate that in-place feedback resolves a core limitation of multi-turn feedback: models often fail to apply feedback precisely to erroneous parts of the response, leaving errors uncorrected and sometimes introducing new mistakes into previously correct content. These findings suggest that in-place feedback offers a more natural and effective mechanism for guiding LLMs in reasoning-intensive tasks.

In-Place Feedback: A New Paradigm for Guiding LLMs in Multi-Turn Reasoning

TL;DR

Abstract

In-Place Feedback: A New Paradigm for Guiding LLMs in Multi-Turn Reasoning

TL;DR

Abstract

Paper Structure

Table of Contents

Figures (23)