Word Ladders: A Mobile Application for Semantic Data Collection
Marianna Marcella Bolognesi, Claudia Collacciani, Andrea Ferrari, Francesca Genovese, Tommaso Lamarra, Adele Loia, Giulia Rambelli, Andrea Amelio Ravelli, Caterina Villani
TL;DR
Word Ladders is a gamified mobile platform for collecting hierarchical semantic data via IS-A word ladders in English and Italian, targeting both linguistic resources and cognitive research. The system uses a React Native frontend, a Node.js/MongoDB backend, and AWS hosting to gather anonymized sociolinguistic data, from which specificity metrics and a hierarchical taxonomy are constructed. Ladder quality is scored against MultiWordNet with a formula that balances validated and novel entries and adds a time-based bonus. The paper details the game rules, data architecture, and preliminary analyses (roughly 30k games from ~3k users in six months), and describes an educational deployment in Italian schools alongside plans to scale the English data and compare human vs. LLM categorizations. Overall, Word Ladders offers a scalable approach to generating cross-language lexical resources and to probing cognition and readability, with practical impact on NLP tasks and educational vocabulary training.
Abstract
Word Ladders is a free mobile application for Android and iOS, developed within the Abstraction project (ERC-2021-STG-101039777) to collect linguistic data, specifically lists of words related to each other through semantic relations of categorical inclusion. We provide an overview of Word Ladders, explaining its game logic, its motivation, and its expected results and applications to NLP tasks as well as to open questions in cognitive science.
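
To make the notion of a list of words linked by categorical inclusion concrete, the following minimal Python sketch represents a ladder as a list ordered from most generic to most specific and checks each adjacent pair against a hypernym map. The map `TOY_HYPERNYMS`, the helper names `is_a` and `validated_links`, and the example words are hypothetical illustrations; they are not the application's actual data model, and the toy map merely stands in for a lexical resource such as MultiWordNet.

```python
from typing import Dict, List

# Toy hypernym map (child -> parent). Hypothetical data for illustration only;
# it is not taken from MultiWordNet or from the application.
TOY_HYPERNYMS: Dict[str, str] = {
    "poodle": "dog",
    "dog": "mammal",
    "mammal": "animal",
}


def is_a(word: str, ancestor: str, hypernyms: Dict[str, str]) -> bool:
    """Return True if `ancestor` is reachable from `word` via parent links."""
    seen = set()
    current = word
    # Walk up the hierarchy; `seen` guards against cycles in the map.
    while current in hypernyms and current not in seen:
        seen.add(current)
        current = hypernyms[current]
        if current == ancestor:
            return True
    return False


def validated_links(ladder: List[str], hypernyms: Dict[str, str]) -> List[bool]:
    """Check each adjacent (upper, lower) pair of a generic-to-specific ladder."""
    return [is_a(lower, upper, hypernyms) for upper, lower in zip(ladder, ladder[1:])]


ladder = ["animal", "dog", "poodle"]
print(validated_links(ladder, TOY_HYPERNYMS))
```

In this sketch, a link that the reference map cannot confirm (for example, a word missing from the map) is simply reported as `False`; whether such an entry counts as an error or as a plausible novel entry is a separate decision that this example does not attempt to model.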
