Fine Tuning Methods for Low-resource Languages
Tim Bakkenes, Daniel Wang, Anton Johansson
TL;DR
This work addresses the underrepresentation of non-English languages in large language models by adapting Google's Gemma 2 to Swedish with a hybrid approach that combines LoRA-based parameter-efficient fine-tuning with Retrieval-Augmented Generation (RAG). It builds two Swedish-focused datasets (a fine-tuning set and a RAG knowledge corpus) and evaluates the model on question answering (QA), summarization, and translation using exact match (EM), F1, ROUGE, BLEU, METEOR, BERTScore, and COMET. The results show improvements on several tasks, especially when using RAG and pretrained Swedish embeddings, and identify overfitting and dataset limitations as the main challenges. The study offers a practical blueprint for communities seeking to adapt LLMs to local languages, in support of cultural preservation and inclusive AI deployment, and discusses the trade-offs between compute cost, data quality, and model size. It also outlines future work, including larger curated datasets, reinforcement learning from human feedback, and more rigorous cross-language evaluation, to improve generalization and trust in multilingual AI systems.
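For readers unfamiliar with the QA metrics named above, the following is a minimal sketch of exact match and token-level F1 in the style commonly used for extractive QA. It is not the authors' evaluation code: the function names (`normalize`, `exact_match`, `token_f1`) and the normalization rules (lowercasing and stripping ASCII punctuation, with no article removal) are assumptions made for illustration.

```python
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop ASCII punctuation, and split on whitespace.

    Assumption: no article stripping, since English-style article
    removal does not carry over directly to Swedish.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a prediction that contains the reference answer plus extra words scores 0 on EM but receives partial credit on F1, which is why the two metrics are usually reported together.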
Abstract
The rise of large language models has not been inclusive of all cultures. The models are trained mostly on English texts and culture, which makes them underperform in other languages and cultural contexts. By developing a generalizable method for preparing culturally relevant datasets and post-training the Gemma 2 model, this project aimed to improve the performance of Gemma 2 for an underrepresented language and to show how others can do the same to unlock the power of generative AI in their own countries and preserve their cultural heritage.
