CaRT: Teaching LLM Agents to Know When They Know Enough

Grace Liu, Yuxiao Qu, Jeff Schneider, Aarti Singh, Aviral Kumar

TL;DR

CaRT tackles the problem of when to stop gathering information in multi-turn tasks by training LLMs on counterfactual termination trajectories augmented with explicit verbal reasoning traces. It formalizes termination as a policy optimization problem and uses hard negative counterfactuals alongside reasoning traces to learn an implicit value function guiding termination decisions, achieving sample-efficient learning. In experiments across interactive medical diagnosis and mathematical reasoning, CaRT outperforms base models and standard fine-tuning, with robust out-of-distribution performance and more efficient use of test-time computation. The work highlights the value of comparative reasoning and counterfactual data for robust termination in open-domain LLM applications, and points to future work on unified exploration and explicit uncertainty-aware termination.

Abstract

Many tasks require learned models to strategically gather relevant information over multiple rounds of interaction before actually acting on a task. Strategic information gathering requires models to know not only how to effectively acquire information, but also when to stop gathering information and make a decision, in order to avoid overthinking or getting derailed when acting. In this paper, we formalize this problem and introduce Counterfactuals and Reasoning for Termination (CaRT), an approach for teaching LLMs when to stop seeking information. To appropriately learn when to terminate, CaRT fine-tunes LLMs using counterfactual pairs of trajectories, one where termination is appropriate and a minimally modified version of the same trajectory where it is not. It trains the LLM to explain the rationale for the termination decision in either case via verbal reasoning, and imbues this capability into the base LLM via fine-tuning. We instantiate CaRT in two domains: interactive medical diagnosis and math problem solving. In both domains, we find that CaRT improves the efficiency of information gathering and task success rate compared to other fine-tuning methods.
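The counterfactual-pair idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Turn`, `SFTExample`, `render`, and `make_counterfactual_pair`, the `TERMINATE`/`CONTINUE` labels, and the choice to create the "minimally modified" trajectory by swapping out the answer of a single decisive turn are all assumptions made for illustration. The sketch only shows the shape of the training data: two nearly identical histories, each paired with a verbal rationale and an opposite termination decision.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    question: str
    answer: str


@dataclass
class SFTExample:
    prompt: str  # interaction history shown to the model
    target: str  # verbal reasoning followed by the termination decision


def render(turns: List[Turn]) -> str:
    """Flatten an interaction history into a plain-text prompt."""
    return "\n".join(f"Q: {t.question}\nA: {t.answer}" for t in turns)


def make_counterfactual_pair(
    turns: List[Turn],
    decisive_index: int,
    neutral_answer: str,
    stop_reason: str,
    continue_reason: str,
) -> Tuple[SFTExample, SFTExample]:
    """Build a (terminate, continue) pair from one trajectory.

    Assumption: turns[decisive_index] is the turn whose answer makes
    termination appropriate. The counterfactual keeps everything else
    fixed and replaces only that answer with an uninformative one.
    """
    if not 0 <= decisive_index < len(turns):
        raise IndexError("decisive_index is outside the trajectory")
    prefix = turns[: decisive_index + 1]
    modified = prefix[:-1] + [Turn(prefix[-1].question, neutral_answer)]
    positive = SFTExample(
        prompt=render(prefix),
        target=f"Reasoning: {stop_reason}\nDecision: TERMINATE",
    )
    negative = SFTExample(
        prompt=render(modified),
        target=f"Reasoning: {continue_reason}\nDecision: CONTINUE",
    )
    return positive, negative


# Hypothetical medical-diagnosis style trajectory (invented for illustration).
turns = [
    Turn("Do you have a fever?", "Yes, for two days."),
    Turn("Have you noticed a rash?", "Yes, a ring-shaped rash on my arm."),
]
pos, neg = make_counterfactual_pair(
    turns,
    decisive_index=1,
    neutral_answer="I am not sure.",
    stop_reason="The rash description narrows the diagnosis enough to answer.",
    continue_reason="Fever alone is not specific; more questions are needed.",
)
print(pos.prompt, pos.target, sep="\n")
print(neg.prompt, neg.target, sep="\n")
```

Because the two prompts differ in only one answer, a model fine-tuned on such pairs cannot rely on superficial cues such as history length and has to attend to the information that actually changes the decision.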

Paper Structure

This paper contains 27 sections, 1 equation, 7 figures, 3 tables.

Figures (7)

  • Figure 1: A schematic illustration of the termination behavior of models with and without our proposed approach. While LLMs typically fail to recognize the best points to stop thinking or questioning, often either overshooting or undershooting the amount of information needed (a - top, b - left), our approach, CaRT, imbues them with the ability to correctly identify this point.
  • Figure 2: An example of terminating information gathering in the medical diagnosis domain. The model should terminate when there is sufficient information.
  • Figure 3: CaRT outperforms other termination methods for medical diagnosis. (a) Performance on holdout data showing CaRT outperforms the base model and SFT baseline. Confidence intervals for all models are computed over 30 evaluation runs. Confidence intervals for CaRT and SFT are computed over 3 training runs. (b) CaRT also shows superior performance on out-of-distribution dermatology diagnosis tasks.
  • Figure 4: Termination performance on Math. Performance on AIME2025 showing CaRT outperforms the base model and the no-reasoning approach. Confidence intervals for all models are computed over 3 training seeds and 16 evaluations.
  • Figure 5: Ablation study: termination performance on holdout data. We ablate counterfactual training data and reasoning augmentation. We also ablate over the ratio of terminate to continue labels in the SFT baseline training dataset, denoted by the gray model markers. We include baselines with an auxiliary confidence prediction task as well as off-the-shelf GPT models.
  • ...and 2 more figures