Overview of the First Workshop on Language Models for Low-Resource Languages (LoResLM 2025)
Hansi Hettiarachchi, Tharindu Ranasinghe, Paul Rayson, Ruslan Mitkov, Mohamed Gaber, Damith Premasiri, Fiona Anting Tan, Lasitha Uyangodage
TL;DR
The paper introduces LoResLM 2025, the first workshop dedicated to language models for low-resource languages, held in conjunction with COLING 2025 in Abu Dhabi. It frames the problem around data scarcity and the bias of NLP towards high-resource languages, and outlines the workshop’s goals, submission process, and scope. The workshop accepted 35 of 52 submissions, spanning eight language families and 13 NLP research areas, with Language Modelling and Machine Translation leading the program. Indo-European languages are the most represented group, which underscores the need for broader linguistic representation, domain diversity, and new application areas to advance inclusive NLP for the world's many low-resource languages.
Abstract
The first Workshop on Language Models for Low-Resource Languages (LoResLM 2025) was held in conjunction with the 31st International Conference on Computational Linguistics (COLING 2025) in Abu Dhabi, United Arab Emirates. This workshop mainly aimed to provide a forum for researchers to share and discuss their ongoing work on language models (LMs) focusing on low-resource languages, following the recent advancements in neural language models and their linguistic biases towards high-resource languages. LoResLM 2025 attracted notable interest from the natural language processing (NLP) community, resulting in 35 accepted papers from 52 submissions. These contributions cover a broad range of low-resource languages from eight language families and 13 diverse research areas, paving the way for future possibilities and promoting linguistic inclusivity in NLP.
