From Brazilian Portuguese to European Portuguese

João Sanches, Rui Ribeiro, Luísa Coheur

TL;DR

The paper tackles translation from Brazilian Portuguese (bp) to European Portuguese (ep), addressing the resource inequity between the two variants and the lack of direct bp↔ep translation tools. It fine-tunes multilingual NMT models (M2M100-418M, mBART-large-50) on parallel data from TED Talks and subtitles and compares them with ChatGPT 3.5 Turbo, using a gold standard of 500 bp-ep sentences across five domains. The findings show that merged-domain fine-tuning with mBART-50 often yields strong performance among the fine-tuned models, while ChatGPT demonstrates robust domain-general translation on the Gold Collection, highlighting both the potential and the limits of current models in addressing this language resource gap. The released datasets and evaluation framework enable standardized benchmarking for bp-ep translation and provide a foundation for further improvements in cross-variant Portuguese translation.

Abstract

Brazilian Portuguese and European Portuguese are two varieties of the same language and, despite their close similarities, they exhibit several differences. However, there is a significant disproportion in the availability of resources between the two variants, with Brazilian Portuguese having more abundant resources. This inequity can impact the quality of translation services accessible to European Portuguese speakers. To address this issue, we propose the development of a Brazilian Portuguese to European Portuguese translation system, leveraging recent advancements in neural architectures and models. To evaluate the performance of such systems, we manually curated a gold test set comprising 500 sentences across five different topics. Each sentence in the gold test set has two distinct references, facilitating a straightforward evaluation of future translation models. We experimented with various models by fine-tuning existing Large Language Models using parallel data extracted from movie subtitles and TED Talks transcripts in both Brazilian and European Portuguese. Our evaluation involved the use of conventional automatic metrics as well as a human evaluation. In addition, all models were compared against ChatGPT 3.5 Turbo, which currently yields the best results.
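
To illustrate why two references per sentence simplify automatic evaluation, the sketch below computes a BLEU-style clipped n-gram precision against several references at once. It is a minimal illustration and not the paper's evaluation code: the function names are hypothetical, tokenization is plain whitespace splitting, and the example sentences are invented rather than taken from the gold test set.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(hypothesis, references, n=1):
    """BLEU-style modified n-gram precision against one or more references.

    Each hypothesis n-gram is credited at most as many times as it occurs
    in the single reference where it is most frequent.
    """
    hyp_counts = ngrams(hypothesis.split(), n)
    if not hyp_counts:
        return 0.0
    # Maximum count of every n-gram over all references.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref.split(), n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in hyp_counts.items())
    return matched / sum(hyp_counts.values())


# Invented ep sentences (assumption: not from the paper's gold collection).
hypothesis = "o autocarro chegou atrasado"
ref_a = "o autocarro chegou tarde"
ref_b = "o autocarro chegou atrasado"

one_ref = clipped_precision(hypothesis, [ref_a])
two_refs = clipped_precision(hypothesis, [ref_a, ref_b])
```

With only `ref_a`, the valid wording "atrasado" is penalized; adding `ref_b` credits it. This is the practical benefit of a second reference: legitimate lexical variation in the target variant is less likely to be scored as an error.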

Paper Structure (15 sections, 1 equation, 8 tables)