Advancing Equitable AI: Evaluating Cultural Expressiveness in LLMs for Latin American Contexts

Brigitte A. Mora-Reyes, Jennifer A. Drewyor, Abel A. Reyes-Angulo

TL;DR

The study tackles Eurocentric biases in LLMs that marginalize Latin American contexts. It constructs a culturally aware Latin American dataset and evaluates six LLMs against ground-truth responses from Latin American users, using a Cultural Expressiveness metric defined as $CE = \alpha_1 \cdot \text{Key. Freq.} + \alpha_2 \cdot (1 - \Delta S) + \alpha_3 \cdot \text{Sem. Sim.}$, where Key. Freq. is the keyword frequency, $\Delta S$ is the sentiment misalignment, and Sem. Sim. is the semantic similarity to user responses. The results show that some models achieve higher lexical alignment, but sentiment alignment and deeper cultural understanding lag behind. Fine-tuning Mistral-7B via LoRA on a 54-question dataset raises CE from $0.49$ to $0.70$ (a $42.9\%$ relative gain), driven by increased keyword usage, reduced sentiment misalignment, and better semantic similarity to user perspectives. The work highlights the value of region-specific data and targeted alignment for equitable AI and points to broader needs, including Indigenous language coverage and community-driven data creation. Overall, the framework and findings advocate for scalable, culturally grounded approaches (including RLHF) to reshape AI systems toward more accurate and representative Latin American knowledge and voices.
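To make the metric concrete, the following is a minimal Python sketch of the CE formula and of the relative-gain arithmetic behind the reported $42.9\%$. It is an illustration under stated assumptions, not the authors' implementation: the equal weights, the assumption that each component is already normalized to $[0, 1]$, and the names `CEComponents`, `cultural_expressiveness`, and `relative_gain` are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CEComponents:
    """Per-model inputs to CE. Assumed (not stated in the source) to lie in [0, 1]."""
    keyword_freq: float   # Key. Freq.: normalized Latin American keyword frequency
    sentiment_gap: float  # Delta S: sentiment misalignment between users and the model
    semantic_sim: float   # Sem. Sim.: semantic similarity to user responses


def cultural_expressiveness(c: CEComponents,
                            weights: tuple = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """CE = a1 * KeyFreq + a2 * (1 - DeltaS) + a3 * SemSim.

    The equal weights are a placeholder assumption; the summary above
    does not give the values of alpha_1, alpha_2, alpha_3.
    """
    a1, a2, a3 = weights
    return a1 * c.keyword_freq + a2 * (1.0 - c.sentiment_gap) + a3 * c.semantic_sim


def relative_gain(before: float, after: float) -> float:
    """Relative improvement in percent: (after - before) / before * 100."""
    return (after - before) / before * 100.0


# Reported CE for Mistral-7B before and after fine-tuning: 0.49 -> 0.70.
gain = relative_gain(0.49, 0.70)
print(f"{gain:.1f}%")  # 42.9%
```

Note that a smaller sentiment gap raises CE because the term enters as $(1 - \Delta S)$, so all three components push the score in the same direction. The $42.9\%$ figure is a relative gain, $(0.70 - 0.49) / 0.49$, not a difference in percentage points.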

Abstract

Artificial intelligence (AI) systems often reflect biases from economically advanced regions, marginalizing contexts in economically developing regions like Latin America due to imbalanced datasets. This paper examines AI representations of diverse Latin American contexts, revealing disparities between data from economically advanced and developing regions. We highlight how the dominance of English over Spanish, Portuguese, and indigenous languages such as Quechua and Nahuatl perpetuates biases, framing Latin American perspectives through a Western lens. To address this, we introduce a culturally aware dataset rooted in Latin American history and socio-political contexts, challenging Eurocentric models. We evaluate six language models on questions testing cultural context awareness, using a novel Cultural Expressiveness metric, statistical tests, and linguistic analyses. Our findings show that some models better capture Latin American perspectives, while others exhibit significant sentiment misalignment (p < 0.001). Fine-tuning Mistral-7B with our dataset improves its cultural expressiveness by 42.9%, advancing equitable AI development. We advocate for equitable AI by prioritizing datasets that reflect Latin American history, indigenous knowledge, and diverse languages, while emphasizing community-centered approaches to amplify marginalized voices.

Paper Structure

This paper contains 13 sections, 4 equations, 5 figures, 8 tables.

Figures (5)

  • Figure 1: Framework proposed to inject cultural context awareness into the knowledge of the LLMs.
  • Figure 2: Normalized Latin American keyword frequency per response: Users vs. LLMs. BLOOM-7B's lower frequency may be influenced by 9 missing responses.
  • Figure 3: Sentiment score distribution: Users vs. LLMs, shown as a violin plot. Sample sizes: Resp V1, Resp V2, Mistral-7B, Zephyr-7B, Llama-2-7B, Grok, ChatGPT ($n=54$); BLOOM-7B ($n=45$).
  • Figure 4: Distribution of sentiment differences (User - LLM) between averaged Latin American user responses and LLMs.
  • Figure 5: Visualization of response embeddings using t-SNE and Isomap: (a) Isomap before transformation, (b) Isomap after transformation, (c) t-SNE before transformation, (d) t-SNE after transformation. The legend indicates the classes of the dataset.