Word Ladders: A Mobile Application for Semantic Data Collection

Marianna Marcella Bolognesi; Claudia Collacciani; Andrea Ferrari; Francesca Genovese; Tommaso Lamarra; Adele Loia; Giulia Rambelli; Andrea Amelio Ravelli; Caterina Villani

TL;DR

Word Ladders is a gamified mobile platform for collecting hierarchical semantic data via IS-A word ladders in English and Italian, targeting both linguistic resources and cognitive research. The system uses a React Native frontend, a NodeJS/MongoDB backend, and AWS hosting to gather anonymized sociolinguistic data and to construct specificity metrics and a hierarchical taxonomy; ladder quality is scored against MultiWordNet with a formula that balances validated and novel entries, plus a time-based bonus. The paper details the game rules, data architecture, and preliminary analyses (roughly 30k games from ~3k users in six months), and describes an educational deployment in Italian schools alongside plans to scale English data and compare human vs. LLM categorizations. Overall, Word Ladders offers a scalable approach to generating cross-language lexical resources and probing cognition and readability, with practical impact for NLP tasks and educational vocabulary training.

Abstract

Word Ladders is a free mobile application for Android and iOS, developed within the Abstraction project (ERC-2021-STG-101039777) for collecting linguistic data, specifically lists of words related to each other through semantic relations of categorical inclusion. We provide an overview of Word Ladders, explaining its game logic, motivation, expected results, and applications to NLP tasks as well as to the investigation of open questions in cognitive science.

Paper Structure (15 sections, 2 equations, 3 figures, 2 tables)

Introduction
Related Works
WORD LADDERS
Game Logic
Motivations
Relevance for Language Sciences and NLP
Educational Applications
App Documentation and Data Collection
Service Implementation
Game levels
Data structure
Ladder evaluation
Preliminary Data Analyses
Conclusion
Bibliographical References

Figures (3)

Figure 1: Schematic representation of the workflow of data collection and processing. (a) Data are collected through Word Ladders, anonymized, and stored on an AWS server. (b) Stored data are accessed through the Postman API and converted to generate a graph. (c) The resulting graph is post-processed to detect typos and remove noisy ladders. (d) The final graph is used to (i) understand the semantic organization of users (of different sociodemographic backgrounds) and (ii) compute the Specificity rating for the given words.
Figure 2: Education (a) and profession (b) information about Word Ladders users.
Figure 3: Cumulative counts for number of users (a) and number of played games (b).
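
Steps (b) and (c) of the workflow in Figure 1 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' pipeline: the input format (each ladder as a list of words ordered from most generic to most specific), the function names `ladders_to_edges` and `drop_rare_edges`, and the frequency threshold `min_count`, which stands in here for the typo detection and noise removal that the paper describes.

```python
from collections import Counter


def ladders_to_edges(ladders):
    """Count IS-A edges (specific word -> more generic word) across ladders.

    Assumed input format (hypothetical): each ladder is a list of words
    ordered from the most generic to the most specific.
    """
    edges = Counter()
    for ladder in ladders:
        # Normalize case and whitespace, and skip empty rungs.
        words = [w.strip().lower() for w in ladder if w.strip()]
        # Pair each word with the next, more specific one.
        for generic, specific in zip(words, words[1:]):
            if generic != specific:
                edges[(specific, generic)] += 1
    return edges


def drop_rare_edges(edges, min_count=2):
    """Keep only edges proposed at least `min_count` times.

    This is a simple stand-in for noise filtering: an edge seen once is
    more likely to be a typo or an idiosyncratic answer.
    """
    return {edge: count for edge, count in edges.items() if count >= min_count}


# Toy ladders, invented for this example.
ladders = [
    ["animal", "dog", "poodle"],
    ["animal", "dog", "beagle"],
    ["Animal ", "dog"],
    ["animal", "dgo"],  # contains a typo
]

edges = ladders_to_edges(ladders)
kept = drop_rare_edges(edges)
print(kept)
```

With these toy ladders, only the edge from "dog" to "animal" is proposed more than once, so it is the only one that survives the filter. The misspelled "dgo" edge is dropped, but so are the valid "poodle" and "beagle" edges, which shows why a bare frequency threshold is too blunt for real data and why a dedicated typo-detection step is needed.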
