PORTULAN ExtraGLUE Datasets and Models: Kick-starting a Benchmark for the Neural Processing of Portuguese

Tomás Osório, Bernardo Leite, Henrique Lopes Cardoso, Luís Gomes, João Rodrigues, Rodrigo Santos, António Branco

TL;DR

The paper addresses the scarcity of comprehensive benchmarks for Portuguese NLP by introducing PORTULAN ExtraGLUE, a suite of GLUE- and SuperGLUE-derived tasks machine-translated with DeepL into European (pt-PT) and Brazilian (pt-BR) Portuguese, accompanied by open-source Albertina baselines fine-tuned with LoRA. It provides 14 tasks across single-sentence, similarity, inference, QA, and reasoning categories, along with error analyses of translation quality and label integrity. The authors fine-tune Albertina 1.5B models with LoRA adapters for both Portuguese variants and report favorable performance against multilingual baselines, establishing strong first baselines for these Portuguese tasks. They also discuss the limitations of machine-translated data, including errors in pronoun resolution and named-entity translation, and propose future enhancements such as manual curation and a leaderboard to advance Portuguese NLP benchmarking.

Abstract

Leveraging research on the neural modelling of Portuguese, we contribute a collection of datasets for an array of language processing tasks and a corresponding collection of neural language models fine-tuned on these downstream tasks. To align with mainstream benchmarks in the literature, originally developed in English, and to kick-start their Portuguese counterparts, the datasets were machine-translated from English with a state-of-the-art translation engine. The resulting PORTULAN ExtraGLUE benchmark is a basis for research on Portuguese whose improvement can be pursued in future work. Similarly, the respective fine-tuned neural language models, developed with a low-rank adaptation approach, are made available as baselines that can stimulate future work on the neural processing of Portuguese. All datasets and models have been developed and are made available for two variants of Portuguese: European and Brazilian.
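
As a rough illustration of the low-rank adaptation idea mentioned above, the following minimal sketch shows how a frozen weight matrix W is combined with a trainable low-rank update, W + (alpha / r) * B A. It is not the authors' training code: the matrix sizes, the rank, the `alpha` value, and all function names are assumptions chosen only for illustration, and plain Python lists stand in for tensors.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank."""
    r = len(a)            # A has shape (r, d_in)
    delta = matmul(b, a)  # (d_out, r) @ (r, d_in) -> (d_out, d_in)
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def count_params(m):
    """Count the entries of a matrix stored as a list of rows."""
    return sum(len(row) for row in m)


# Illustrative sizes only (assumed, not taken from the paper).
d_out, d_in, rank = 32, 32, 2
rng = random.Random(0)

# Frozen pretrained weight: never updated during adaptation.
w = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]
# Trainable adapter factors: A starts random, B starts at zero,
# so the adapted layer initially behaves exactly like the frozen one.
a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
b = [[0.0] * rank for _ in range(d_out)]

w_eff = lora_effective_weight(w, a, b, alpha=16)

full_params = count_params(w)                    # entries a full fine-tune would update
lora_params = count_params(a) + count_params(b)  # entries the adapter trains instead
print(full_params, lora_params)
```

In this toy setting the adapter trains 128 values instead of the 1024 in the full matrix, which is the kind of saving that makes low-rank adapters attractive for fine-tuning large encoders on many downstream tasks.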

Paper Structure (17 sections, 3 tables)