FOR-Prompting: From Objection to Revision via an Asymmetric Prompting Protocol

He Zhang, Anzhou Zhang, Jian Dai

TL;DR

FOR-Prompting introduces an asymmetric prompting protocol that externalizes critique through a Defender–Debater–Host loop, where the Debater poses questions and the Defender revises without receiving direct fixes. This design preserves a single, accountable reasoning chain while leveraging external pressure to surface gaps, assumptions, and constraints, enabling automated HITL-style revision. On GSM8K, FOR-Prompting achieves about a 22% accuracy gain over single-prompt and matches CoT in accuracy while surpassing it in reasoning and coherence, with similar benefits observed for small open-source models (e.g., 1B LLaMA) and open-ended tasks. The approach is model-agnostic, scalable, and well-suited for on-device or constrained deployments, offering a controlled, interpretable mechanism to study and harness external questioning for improved AI reasoning.

Abstract

Reasoning protocols such as Chain of Thought (CoT) and Tree of Thought (ToT) organize internal deliberation but lack an explicit mechanism for external questioning that elicits self-revision. We present FOR-Prompting (From Objection to Revision Prompting), an asymmetric protocol in which a Defender proposes an answer, an Objectioner raises question-style objections with no direct fixes, and a Host enforces consistency and closure. On GSM8K we observe a gain of about 22 percentage points over single-prompt and accuracy on par with CoT, with more than 10% higher ratings in reasoning and coherence from a uniform GPT-4.1 judge. FOR-Prompting also corrects mistakes on tricky queries without tools or human supervision, and it improves the performance of small-scale models (approximately 19% accuracy improvement for Llama3.2:1b on GSM8K), highlighting its promise for small models and on-device personal use. Beyond factual QA, qualitative analyses on open-ended tasks show enhanced exploration and refinement, with dialogue traces that make assumptions and trade-offs explicit. The protocol is model-agnostic and operates purely at the prompt level through role-structured turns, so it works with hosted and local models of different sizes without retraining, and it supports large-scale study of objection-guided reasoning.

Paper Structure

This paper contains 54 sections, 2 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: FOR-Prompting protocol flow (external questions only). The Debater raises questions, the Defender revises answers, and the Host synthesizes the final output.
  • Figure 2: Evaluation results on GSM8K across three representative methods: single-prompt, CoT, and FOR-Prompting. Reported metrics include accuracy, coherence, reasoning quality, and holistic GPT-4.1 evaluation scores.
  • Figure 3: Accuracy of the Llama-3.2:1B model on 1,319 GSM8K problems: single-prompt vs. one-round and three-round external FOR-Prompting. One questioning round yields most of the improvement; three rounds add only marginal gain.
  • Figure 4: A detailed example of FOR-Prompting with initial task, incorrect single-prompt answer, debater’s questions, and corrected defender response.
  • Figure : FOR-Prompting Protocol (abstracted pseudocode)
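
The role-structured loop described in the abstract and in Figure 1 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `for_prompting`, the `LLM` callable type, and all prompt wording are assumptions introduced here, and any model backend (hosted or local) would need to be wrapped as a plain prompt-to-text function.

```python
from typing import Callable, List, Tuple

# Hypothetical interface (an assumption): a model is any function
# that takes a prompt string and returns the model's text reply.
LLM = Callable[[str], str]


def for_prompting(task: str, defender: LLM, debater: LLM, host: LLM,
                  rounds: int = 1) -> Tuple[str, List[Tuple[str, str]]]:
    """Run a Defender-Debater-Host loop; return (final_answer, trace)."""
    # The Defender proposes an initial answer.
    answer = defender(f"Task: {task}\nGive your answer with reasoning.")
    trace: List[Tuple[str, str]] = []
    for _ in range(rounds):
        # The Debater raises question-style objections only, no direct fixes.
        questions = debater(
            f"Task: {task}\nAnswer: {answer}\n"
            "Ask questions that probe gaps or assumptions. Do not give fixes."
        )
        # The Defender revises its own answer in response to the questions.
        answer = defender(
            f"Task: {task}\nYour previous answer: {answer}\n"
            f"Questions raised: {questions}\nRevise your answer."
        )
        trace.append((questions, answer))
    # The Host synthesizes the final output from the latest revision.
    final = host(
        f"Task: {task}\nLatest answer: {answer}\nState the final answer."
    )
    return final, trace
```

The default of one round reflects the Figure 3 caption, which reports that a single questioning round yields most of the improvement. The returned trace keeps each (questions, revised answer) pair so the dialogue can be inspected afterwards.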