Table of Contents
Fetching ...

Autoformalization of Game Descriptions using Large Language Models

Agnieszka Mensfelt, Kostas Stathis, Vince Trencsenyi

TL;DR

This work tackles the challenge of converting natural-language game descriptions into formal representations usable by logic-based solvers. It introduces an autoformalization framework that couples a Prolog-based solver (with a Situation Calculus foundation) to a GPT-4o language model via one-shot prompting, using solver feedback to refine translations. A dataset of 105 NL game descriptions (standard, non-standard, with/without numerical payoffs) is used to evaluate the approach, achieving 98% syntactic correctness and 88% semantic correctness overall, with strong performance on standard cases and notable generalization to sequential and multi-move games. The results demonstrate that LLMs can bridge real-world strategic scenarios and formal reasoning, enabling automatic generation of game specifications for formal solvers and potentially broadening practical applications of game theory.

Abstract

Game theory is a powerful framework for reasoning about strategic interactions, with applications in domains ranging from day-to-day life to international politics. However, applying formal reasoning tools in such contexts is challenging, as these scenarios are often expressed in natural language. To address this, we introduce a framework for the autoformalization of game-theoretic scenarios, which translates natural language descriptions into formal logic representations suitable for formal solvers. Our approach utilizes one-shot prompting and a solver that provides feedback on syntactic correctness to allow LLMs to refine the code. We evaluate the framework using GPT-4o and a dataset of natural language problem descriptions, achieving 98% syntactic correctness and 88% semantic correctness. These results show the potential of LLMs to bridge the gap between real-life strategic interactions and formal reasoning.

Autoformalization of Game Descriptions using Large Language Models

TL;DR

This work tackles the challenge of converting natural-language game descriptions into formal representations usable by logic-based solvers. It introduces an autoformalization framework that couples a Prolog-based solver (with a Situation Calculus foundation) to a GPT-4o language model via one-shot prompting, using solver feedback to refine translations. A dataset of 105 NL game descriptions (standard, non-standard, with/without numerical payoffs) is used to evaluate the approach, achieving 98% syntactic correctness and 88% semantic correctness overall, with strong performance on standard cases and notable generalization to sequential and multi-move games. The results demonstrate that LLMs can bridge real-world strategic scenarios and formal reasoning, enabling automatic generation of game specifications for formal solvers and potentially broadening practical applications of game theory.

Abstract

Game theory is a powerful framework for reasoning about strategic interactions, with applications in domains ranging from day-to-day life to international politics. However, applying formal reasoning tools in such contexts is challenging, as these scenarios are often expressed in natural language. To address this, we introduce a framework for the autoformalization of game-theoretic scenarios, which translates natural language descriptions into formal logic representations suitable for formal solvers. Our approach utilizes one-shot prompting and a solver that provides feedback on syntactic correctness to allow LLMs to refine the code. We evaluate the framework using GPT-4o and a dataset of natural language problem descriptions, achieving 98% syntactic correctness and 88% semantic correctness. These results show the potential of LLMs to bridge the gap between real-life strategic interactions and formal reasoning.
Paper Structure (28 sections, 1 figure, 4 tables, 1 algorithm)

This paper contains 28 sections, 1 figure, 4 tables, 1 algorithm.

Figures (1)

  • Figure 1: Semantic accuracy of generated predicates for each of the games. "BS", "HD", "MP", "PD", and "SH" stand for Battle of the Sexes, Hawk-Dove, Matching Pennies, Prisoner's Dilemma, and Stag-Hunt, respectively.