Comparing LLM prompting with Cross-lingual transfer performance on Indigenous and Low-resource Brazilian Languages
David Ifeoluwa Adelani, A. Seza Doğruöz, André Coneglian, Atul Kr. Ojha
TL;DR
This study evaluates how prompting a large language model (GPT-4) and cross-lingual transfer from English and Brazilian Portuguese perform on part-of-speech (POS) tagging for 12 low-resource languages (LRLs) from Brazil, 2 African LRLs, and 2 high-resource languages (HRLs). It uses UD-based evaluation sets and Bible corpora for language adaptation, and compares three approaches: prompting, cross-lingual transfer, and Language Adaptive Fine-Tuning (LAFT). HRLs reach high accuracy (above 90%), while LRLs remain substantially lower (below roughly 34%). LAFT improves accuracy for several languages by 3–12 points, and GPT-4 offers modest improvements over basic transfer. The work highlights the substantial data gaps for LRLs and suggests that targeted language resources and small-scale annotation could meaningfully improve cross-lingual POS tagging for Indigenous and low-resource languages.
Abstract
Large language models (LLMs) are transforming NLP across a variety of tasks. However, how well LLMs perform NLP tasks for low-resource languages (LRLs) is less explored. In line with the goals of the AmericasNLP workshop, we focus on 12 LRLs from Brazil, 2 LRLs from Africa, and 2 high-resource languages (HRLs), English and Brazilian Portuguese. Our results indicate that LLMs perform worse on part-of-speech (POS) labeling for LRLs than for HRLs. We explain the reasons behind this failure and provide an error analysis through examples observed in our dataset.
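For readers who want to see how a POS labeling comparison of this kind is typically scored, the following is a minimal sketch of token-level accuracy. It assumes that each sentence is a list of gold tags paired with an equally long list of predicted tags; the function name pos_accuracy and the toy data are hypothetical and are not taken from the paper, which may compute its scores differently. The tag names are borrowed from the Universal Dependencies universal POS tag set.

```python
from typing import List


def pos_accuracy(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Return the fraction of tokens whose predicted tag matches the gold tag.

    Hypothetical helper for illustration; assumes gold and pred are aligned
    sentence by sentence and token by token.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")

    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("each sentence must have one predicted tag per token")
        # Count exact tag matches at each token position.
        correct += sum(g == p for g, p in zip(gold_sent, pred_sent))
        total += len(gold_sent)

    # Avoid division by zero on an empty evaluation set.
    return correct / total if total else 0.0


# Toy example (invented data): 4 of 5 tokens are tagged correctly.
gold = [["DET", "NOUN", "VERB"], ["PRON", "VERB"]]
pred = [["DET", "NOUN", "NOUN"], ["PRON", "VERB"]]
print(pos_accuracy(gold, pred))  # 0.8
```

Under this measure, the same function can score output from any tagging approach, as long as predictions are aligned to the gold tokens.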
