Aligning Large Language Models with Procedural Rules: An Autoregressive State-Tracking Prompting for In-Game Trading
Minkyung Kim, Junsik Kim, Woongcheol Yang, Sangdon Park, Sohee Bae
TL;DR
This paper tackles the challenge of reconciling LLM-driven conversational flexibility with strict procedural constraints in in-game trading. It introduces Autoregressive State-Tracking Prompting (ASTP), a prompting framework that makes dialogue state tracking explicit and verifiable through a Prime–Guide–Enforce workflow and explicit reporting of the previous state. A four-element design embeds state definitions, transition conditions, previous-state identification, and enforcement into the prompt, achieving over 99% state compliance across 300 trading dialogues. A placeholder-based post-processing (PPP) mechanism further improves numerical reliability, reaching 99.3% calculation precision and enabling a smaller model (Gemini-2.5-Flash) to match the performance of a larger one (Gemini-2.5-Pro) while reducing response time from 21.2s to 2.4s. Overall, ASTP demonstrates a practical path to reliable, real-time, rule-governed NPC trading and offers a foundation for broader applications requiring expressive yet compliant language interactions.
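The role of previous-state reporting can be illustrated with a minimal sketch. It assumes the four states named in the trading flow (browse, offer, review, confirm) and a hypothetical transition table; the paper's actual transition conditions live in the prompt and are not given here, so `ALLOWED` and `check_turn` are illustrative names only. The sketch shows how a reported label makes a turn checkable outside the model, not how ASTP itself enforces the rules.

```python
from enum import Enum


class State(Enum):
    BROWSE = "browse"
    OFFER = "offer"
    REVIEW = "review"
    CONFIRM = "confirm"


# Hypothetical transition table (an assumption, not taken from the paper):
# each state maps to the set of states that may follow it.
ALLOWED = {
    State.BROWSE: {State.BROWSE, State.OFFER},
    State.OFFER: {State.BROWSE, State.OFFER, State.REVIEW},
    State.REVIEW: {State.BROWSE, State.OFFER, State.CONFIRM},
    State.CONFIRM: {State.BROWSE},
}


def check_turn(tracked_previous, reported_previous, proposed_next):
    """Return True if the model's turn is consistent with the procedure.

    tracked_previous:  the state the game recorded for the previous turn
    reported_previous: the previous-state label the model reported
    proposed_next:     the state the model wants to move to
    """
    # The model must first identify the previous state correctly.
    if reported_previous != tracked_previous:
        return False
    # The proposed transition must be permitted from that state.
    return proposed_next in ALLOWED[tracked_previous]
```

Under this assumed table, `check_turn(State.BROWSE, State.BROWSE, State.CONFIRM)` is rejected because confirmation cannot follow browsing directly, which is the kind of skipped step the procedural flow is meant to rule out.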
Abstract
Large Language Models (LLMs) enable dynamic game interactions but fail to follow essential procedural flows in rule-governed trading systems, eroding player trust. This work resolves the core tension between the creative flexibility of LLMs and the procedural demands of in-game trading (browse-offer-review-confirm). To this end, the paper introduces Autoregressive State-Tracking Prompting (ASTP), a methodology centered on a strategically orchestrated prompt that compels an LLM to make its state-tracking process explicit and verifiable. Instead of relying on implicit contextual understanding, ASTP tasks the LLM with identifying and reporting a predefined state label from the previous turn. To ensure transactional integrity, this is complemented by a state-specific placeholder post-processing method for accurate price calculations. Evaluation across 300 trading dialogues demonstrates >99% state compliance and 99.3% calculation precision. Notably, ASTP with placeholder post-processing on a smaller model (Gemini-2.5-Flash) matches the performance of a larger model (Gemini-2.5-Pro) while reducing response time from 21.2s to 2.4s, establishing a practical foundation that satisfies both the real-time requirements and the resource constraints of commercial games.
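The idea behind placeholder post-processing can be sketched as follows: the model writes a placeholder token instead of a number, and deterministic game code computes the value and substitutes it. The placeholder names (`TOTAL_PRICE`, `ITEM_COUNT`), the `{NAME}` syntax, and the function `fill_placeholders` are assumptions made for illustration; the paper's state-specific placeholder set is not given here, and this sketch does not model the state-specific part.

```python
import re


def fill_placeholders(response, cart, prices):
    """Replace {NAME} tokens in a model response with computed values.

    response: text produced by the model, containing placeholders
    cart:     mapping of item name -> quantity
    prices:   mapping of item name -> unit price
    """
    # Arithmetic is done in code, never by the language model.
    total = sum(prices[item] * qty for item, qty in cart.items())
    values = {
        "TOTAL_PRICE": str(total),
        "ITEM_COUNT": str(sum(cart.values())),
    }

    def substitute(match):
        name = match.group(1)
        # Unknown placeholders are left untouched so they can be detected.
        return values.get(name, match.group(0))

    return re.sub(r"\{([A-Z_]+)\}", substitute, response)


reply = fill_placeholders(
    "That comes to {TOTAL_PRICE} gold for {ITEM_COUNT} items.",
    cart={"potion": 3, "sword": 1},
    prices={"potion": 25, "sword": 120},
)
print(reply)  # That comes to 195 gold for 4 items.
```

Keeping the arithmetic out of the generated text is what allows a smaller model to stay accurate on prices: it only has to place the token correctly, not compute the sum.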
