RoQLlama: A Lightweight Romanian Adapted Language Model

George-Andrei Dima; Andrei-Marius Avram; Cristian-George Crăciun; Dumitru-Clementin Cercel

TL;DR

RoQLlama-7b is a Romanian-adapted, 4-bit quantized version of Llama2-7b, trained with QLoRA to address resource constraints in non-English NLP. By aggregating Romanian corpora (RoWiki, RoTex, OSCAR, CC-100) and applying LoRA-based fine-tuning, the model achieves competitive zero-shot results across seven Romanian tasks while reducing the memory footprint by ≈3x compared to the full-precision base model. The work introduces RoMedQA, a Romanian medical exam dataset of 4,127 questions, and reports detailed evaluations on RoQA, REDv2, MOROCO, SaRoCo, RoSum, and RoSTS, showing strengths in some tasks and gaps in others. Overall, RoQLlama-7b demonstrates the viability of 4-bit quantization and parameter-efficient fine-tuning (PEFT) for Romanian-language NLP, with practical implications for resource-constrained deployment and for future Romanian NLP benchmarks.
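To make the memory argument concrete, here is a rough back-of-the-envelope sketch of the weight storage needed at two precisions. The nominal parameter count of 7 billion and the 16-bit baseline are assumptions made for illustration; they are not figures taken from the paper.

```python
# Weights-only memory estimate. All constants below are illustrative assumptions.
N_PARAMS = 7_000_000_000  # assumed nominal parameter count for a "7b" model


def weight_gib(n_params: int, bits_per_param: float) -> float:
    """Return the memory needed to store the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


fp16_gib = weight_gib(N_PARAMS, 16)  # assumed 16-bit baseline
int4_gib = weight_gib(N_PARAMS, 4)   # 4-bit quantized weights

print(f"16-bit weights: {fp16_gib:.1f} GiB")
print(f"4-bit weights:  {int4_gib:.1f} GiB")
print(f"weights-only ratio: {fp16_gib / int4_gib:.1f}x")
```

A weights-only estimate like this need not match the roughly threefold reduction reported for the model, since a real footprint also includes quantization constants, any layers kept in higher precision, and runtime buffers.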

Abstract

The remarkable achievements of open-source large language models (LLMs) in recent years have been concentrated predominantly on English-language tasks. In this paper, we aim to advance the performance of Llama2 models on Romanian tasks. We address the problem of limited computing resources by using QLoRA for training. We release RoQLlama-7b, a quantized LLM that shows equal or improved results compared to its full-sized counterpart when tested on seven Romanian downstream tasks in the zero-shot setup. It also consistently achieves higher average scores across all few-shot prompts. Additionally, we introduce a novel Romanian dataset, RoMedQA, which contains single-choice medical questions in Romanian.
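As a rough illustration of why LoRA-style training (the adapter part of QLoRA) suits limited hardware, the sketch below counts trainable parameters for a single weight matrix. LoRA freezes the original matrix and trains two low-rank factors, of shapes (d_out × r) and (r × d_in). The hidden size of 4096 is assumed here for Llama2-7b, and the rank of 8 is a hypothetical value chosen for the example; the rank and the set of adapted matrices actually used for RoQLlama-7b are not stated in this section.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter: B (d_out x r) plus A (r x d_in)."""
    return rank * (d_out + d_in)


D_MODEL = 4096  # assumed hidden size of Llama2-7b
RANK = 8        # hypothetical LoRA rank, not taken from the paper

full = D_MODEL * D_MODEL                        # parameters in one square projection matrix
lora = lora_param_count(D_MODEL, D_MODEL, RANK)  # parameters in its LoRA adapter
fraction = lora / full

print(f"full matrix:  {full:,} parameters")
print(f"LoRA adapter: {lora:,} parameters")
print(f"trainable fraction: {fraction:.2%}")
```

Under these assumptions the adapter trains well under one percent of the parameters of the matrix it modifies, which is what keeps gradient and optimizer-state memory small while the frozen base weights stay quantized.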
