Advancing Equitable AI: Evaluating Cultural Expressiveness in LLMs for Latin American Contexts
Brigitte A. Mora-Reyes, Jennifer A. Drewyor, Abel A. Reyes-Angulo
TL;DR
The study tackles Eurocentric biases in LLMs that marginalize Latin American contexts by constructing a culturally aware Latin American dataset and evaluating six LLMs against ground-truth responses from Latin American users. Models are scored with a Cultural Expressiveness metric defined as $CE = \alpha_1 \cdot \text{Key. Freq.} + \alpha_2 \cdot (1 - \Delta S) + \alpha_3 \cdot \text{Sem. Sim.}$, a weighted sum of keyword frequency, sentiment alignment (one minus the sentiment misalignment $\Delta S$), and semantic similarity. The evaluation shows that some models achieve higher lexical alignment, but sentiment alignment and deeper cultural understanding lag behind. Fine-tuning Mistral-7B on a 54-question dataset via LoRA raises CE from $0.49$ to $0.70$ (a $42.9\%$ relative gain), driven by increased keyword usage, reduced sentiment misalignment, and better semantic similarity to user perspectives. The work highlights the value of region-specific data and targeted alignment for equitable AI and points to broader needs, including Indigenous language coverage and community-driven data creation. Overall, the framework and findings advocate for scalable, culturally grounded approaches (including RLHF) to reshape AI systems toward more accurate and representative Latin American knowledge and voices.
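As a rough illustration of how the CE formula combines its three terms, the sketch below computes the weighted sum and checks the reported relative gain. The equal default weights, the assumption that every component lies in $[0, 1]$, and the function names are hypothetical choices made for this example; the summary above does not state the values of $\alpha_1, \alpha_2, \alpha_3$ or how each component is measured.

```python
def cultural_expressiveness(keyword_freq, sentiment_gap, semantic_sim,
                            weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum CE = a1*KeyFreq + a2*(1 - dS) + a3*SemSim.

    All three inputs are assumed to be normalized to [0, 1]; the equal
    default weights are an assumption, not values taken from the paper.
    """
    components = {
        "keyword_freq": keyword_freq,
        "sentiment_gap": sentiment_gap,
        "semantic_sim": semantic_sim,
    }
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    a1, a2, a3 = weights
    # A smaller sentiment gap means better alignment, hence (1 - gap).
    return a1 * keyword_freq + a2 * (1.0 - sentiment_gap) + a3 * semantic_sim


def relative_gain(before, after):
    """Relative improvement of `after` over `before`, as a fraction."""
    return (after - before) / before


# The reported CE scores before and after fine-tuning.
print(f"{relative_gain(0.49, 0.70):.1%}")  # prints 42.9%
```

Note that the $42.9\%$ figure is a relative gain over the baseline score of $0.49$; the absolute change in CE is $0.21$.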
Abstract
Artificial intelligence (AI) systems often reflect biases from economically advanced regions, marginalizing contexts in economically developing regions like Latin America due to imbalanced datasets. This paper examines AI representations of diverse Latin American contexts, revealing disparities between data from economically advanced and developing regions. We highlight how the dominance of English over Spanish, Portuguese, and indigenous languages such as Quechua and Nahuatl perpetuates biases, framing Latin American perspectives through a Western lens. To address this, we introduce a culturally aware dataset rooted in Latin American history and socio-political contexts, challenging Eurocentric models. We evaluate six language models on questions testing cultural context awareness, using a novel Cultural Expressiveness metric, statistical tests, and linguistic analyses. Our findings show that some models better capture Latin American perspectives, while others exhibit significant sentiment misalignment (p < 0.001). Fine-tuning Mistral-7B with our dataset improves its cultural expressiveness by 42.9%, advancing equitable AI development. We advocate for equitable AI by prioritizing datasets that reflect Latin American history, indigenous knowledge, and diverse languages, while emphasizing community-centered approaches to amplify marginalized voices.
