A Turkish Educational Crossword Puzzle Generator
Kamyar Zeinalipour, Yusuf Gökberk Keptiğ, Marco Maggini, Leonardo Rigutini, Marco Gori
TL;DR
The paper addresses the lack of Turkish-language educational crossword tools and demonstrates how large language models can automatically generate clues and layouts. It introduces two datasets: TAC, a collection of Turkish answer-clue pairs, and T4TAC, which supports text-based clue generation with category labels. It also presents a full crossword-generation system that accepts both keyword-driven and text-driven inputs. GPT-3.5-Turbo and Llama-2 models are fine-tuned on these datasets and combined with a schema-driven layout algorithm that ranks candidate layouts by the scoring rule $Score = (FW + 0.5 \cdot LL) \times FR \times LR$. According to human evaluators, the resulting clues and puzzles are of educational quality. The work contributes open datasets and an accessible tool for Turkish education; future work aims to cover more languages and more advanced clue-generation capabilities.
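As a rough illustration of the scoring rule, the sketch below computes $Score = (FW + 0.5 \cdot LL) \times FR \times LR$ for a few candidate layouts and keeps the highest-scoring one. This summary does not define the four terms, so the code treats them as opaque numeric inputs; the names `LayoutStats`, `layout_score`, and `best_layout` are hypothetical and are not taken from the authors' system.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class LayoutStats:
    """Per-layout quantities used by the scoring rule.

    The field names mirror the symbols in the formula (FW, LL, FR, LR).
    Their exact definitions are not given in this summary, so they are
    treated here as plain numbers (assumption).
    """
    fw: float
    ll: float
    fr: float
    lr: float


def layout_score(stats: LayoutStats) -> float:
    """Score = (FW + 0.5 * LL) * FR * LR."""
    return (stats.fw + 0.5 * stats.ll) * stats.fr * stats.lr


def best_layout(candidates: Iterable[LayoutStats]) -> LayoutStats:
    """Return the candidate with the highest score (hypothetical helper)."""
    return max(candidates, key=layout_score)


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    a = LayoutStats(fw=10, ll=8, fr=0.5, lr=0.5)
    b = LayoutStats(fw=12, ll=10, fr=0.5, lr=0.5)
    print(layout_score(a), layout_score(b))
    print(best_layout([a, b]))
```

Because the additive term is multiplied by both $FR$ and $LR$, a layout scores well only when neither of those two factors is small, whatever the values of $FW$ and $LL$.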
Abstract
This paper introduces the first Turkish crossword puzzle generator designed to leverage the capabilities of large language models (LLMs) for educational purposes. We introduce two specially created datasets: one with over 180,000 unique answer-clue pairs, used to generate relevant clues from a given answer, and another with over 35,000 samples containing text, answer, category, and clue data, used to produce clues for specific texts and keywords within given categories. Beyond entertainment, the generator serves as an interactive educational tool that enhances memory, vocabulary, and problem-solving skills. It is a notable step in AI-enhanced education, merging game-like engagement with learning and setting new standards for interactive, intelligent learning tools in Turkish.
