Text2Zinc: A Cross-Domain Dataset for Modeling Optimization and Satisfaction Problems in MiniZinc
Akash Singirikonda, Serdar Kadioglu, Karthik Uppuluri
TL;DR
Text2Zinc addresses the challenge of turning natural language problem descriptions into executable MiniZinc models by introducing a unified cross-domain dataset that spans optimization and satisfaction problems. The work investigates prompting strategies, chain-of-thought reasoning, and knowledge graphs as intermediate representations, revealing that current LLMs are not yet ready to serve as plug-and-play constraint-modeling co-pilots. Key findings show that reasoning-aware prompts (CoT) and compositional decomposition improve model generation, with notable gains in execution accuracy but persistent gaps in solution accuracy. The dataset and baseline results provide a foundation for future research toward solver-agnostic, language-driven modeling, and highlight the need for richer representations and larger, high-quality data to close the gap between natural language and correct MiniZinc formulations.
Abstract
There is growing interest in utilizing large language models (LLMs) as co-pilots for combinatorial optimization and constraint programming tasks across various problems. This paper aims to advance this line of research by introducing Text2Zinc, a cross-domain dataset for capturing optimization and satisfaction problems specified in natural language text. Our work is distinguished from previous attempts by integrating both satisfaction and optimization problems within a unified dataset using a solver-agnostic modeling language. To achieve this, we leverage MiniZinc's solver-and-paradigm-agnostic modeling capabilities to formulate these problems. Using the Text2Zinc dataset, we conduct comprehensive baseline experiments to compare execution and solution accuracy across several methods, including off-the-shelf prompting strategies, chain-of-thought reasoning, and a compositional approach. Additionally, we explore the effectiveness of intermediary representations, specifically knowledge graphs. Our findings indicate that LLMs are not yet a push-button technology to model combinatorial problems from text. We hope that Text2Zinc serves as a valuable resource for researchers and practitioners to advance the field further.
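
The abstract compares methods by execution accuracy and solution accuracy without defining them here. The sketch below shows one plausible reading, and it is an assumption rather than the paper's stated definition: execution accuracy is taken as the fraction of generated models that compile and run, and solution accuracy as the fraction that also reproduce a reference value. The `RunResult` record, its field names, and the tolerance are hypothetical, and comparing a single numeric value is a simplification that fits optimization problems better than satisfaction problems.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunResult:
    """Outcome of running one generated model (hypothetical record, not from the paper)."""
    executed: bool              # True if the model compiled and the solver ran without error
    objective: Optional[float]  # value reported by the generated model, if any
    expected: Optional[float]   # reference value for the problem instance


def execution_accuracy(results):
    """Fraction of generated models that ran without error (assumed definition)."""
    if not results:
        return 0.0
    return sum(r.executed for r in results) / len(results)


def solution_accuracy(results, tol=1e-6):
    """Fraction of generated models that ran and matched the reference value (assumed definition)."""
    if not results:
        return 0.0
    correct = sum(
        1
        for r in results
        if r.executed
        and r.objective is not None
        and r.expected is not None
        and abs(r.objective - r.expected) <= tol
    )
    return correct / len(results)


# Illustrative, made-up outcomes for four generated models.
results = [
    RunResult(executed=True, objective=42.0, expected=42.0),   # runs, correct
    RunResult(executed=True, objective=40.0, expected=42.0),   # runs, wrong answer
    RunResult(executed=False, objective=None, expected=10.0),  # fails to run
    RunResult(executed=True, objective=7.0, expected=7.0),     # runs, correct
]

print(execution_accuracy(results))  # 0.75
print(solution_accuracy(results))   # 0.5
```

Under these assumed definitions a model can only count toward solution accuracy if it also executes, so solution accuracy never exceeds execution accuracy. That is consistent with the reported pattern of gains in execution accuracy alongside persistent gaps in solution accuracy.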
