Bridging the Culture Gap: A Framework for LLM-Driven Socio-Cultural Localization of Math Word Problems in Low-Resource Languages
Israel Abebe Azime, Tadesse Destaw Belay, Dietrich Klakow, Philipp Slusallek, Anshuman Chhabra
TL;DR
The paper addresses the evaluation and improvement of LLM performance on culturally grounded math word problems (MWPs) in low-resource languages. It introduces an automated socio-cultural localization framework that replaces English-centric entities with native names, organizations, and currencies. The pipeline combines entity classification, local entity databases, and one-shot LLM-based localization with quality checks. Evaluation uses GSM8K/AfriMGSM data, with translations generated by NLLB-200 and screened by COMET, and focuses on robustness to entity variations. The study finds that translation alone can obscure true multilingual math ability, whereas locale-aware variants reveal performance disparities and can improve robustness when used to augment training data; the gains are language- and model-dependent. Overall, the framework enables scalable creation of culturally aligned MWPs, reduces English-centric bias, and offers a path toward more culturally faithful benchmarks and stronger multilingual reasoning in LLMs.
Abstract
Large language models (LLMs) have demonstrated significant capabilities in solving mathematical problems expressed in natural language. However, multilingual and culturally grounded mathematical reasoning in low-resource languages lags behind English due to the scarcity of socio-cultural task datasets that reflect accurate native entities such as person names, organization names, and currencies. Existing multilingual benchmarks are predominantly produced via translation and typically retain English-centric entities, owing to the high cost of human annotator-based localization. Moreover, automated localization tools are limited, and hence truly localized datasets remain scarce. To bridge this gap, we introduce a framework for LLM-driven cultural localization of math word problems that automatically constructs datasets with native names, organizations, and currencies from existing sources. We find that translated benchmarks can obscure true multilingual math ability under appropriate socio-cultural contexts. Through extensive experiments, we also show that our framework helps mitigate English-centric entity bias and improves robustness when native entities are introduced across various languages.
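To make the idea of entity localization concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny hand-written entity table and uses plain string substitution where the paper uses one-shot LLM-based localization. The names `LOCAL_ENTITIES`, `SOURCE_ENTITIES`, `localize`, and `numbers_preserved`, the locale key, and the sample problem are all hypothetical.

```python
import re

# Hypothetical local entity database; the entries are illustrative only.
LOCAL_ENTITIES = {
    "am-ET": {"person": ["Tigist", "Abebe"], "currency": "birr"},
}

# Hypothetical result of entity classification on the English source problem.
SOURCE_ENTITIES = {"person": ["Janet", "Tom"], "currency": ["dollars", "dollar"]}


def localize(problem: str, locale: str) -> str:
    """Swap English-centric entities for native ones by string substitution.

    This stands in for the LLM-based step and does not convert amounts.
    """
    local = LOCAL_ENTITIES[locale]
    out = problem
    # Map each source person name to a native name, position by position.
    for src, dst in zip(SOURCE_ENTITIES["person"], local["person"]):
        out = re.sub(rf"\b{re.escape(src)}\b", dst, out)
    # Replace every listed currency word with the local currency name.
    for src in SOURCE_ENTITIES["currency"]:
        out = re.sub(rf"\b{re.escape(src)}\b", local["currency"], out)
    return out


def numbers_preserved(original: str, localized: str) -> bool:
    """Simple quality check: numeric quantities must be unchanged, in order."""
    pattern = r"\d+(?:\.\d+)?"
    return re.findall(pattern, original) == re.findall(pattern, localized)


problem = "Janet sells 16 eggs for 2 dollars each. Tom buys 3 eggs."
localized = localize(problem, "am-ET")
print(localized)
print(numbers_preserved(problem, localized))
```

The numeric check reflects one plausible kind of quality control: localization should change the surface entities while leaving the underlying arithmetic, and therefore the gold answer, intact.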
