Adapting Multilingual LLMs to Low-Resource Languages with Knowledge Graphs via Adapters
Daniil Gurgurov, Mareike Hartmann, Simon Ostermann
TL;DR
The paper aims to improve sentiment analysis (SA) and named entity recognition (NER) in eight low-resource languages by injecting graph knowledge from ConceptNet and Wikipedia into multilingual LLMs through a modular adapter framework. It adopts a MAD-X-inspired architecture with language adapters, task adapters, and AdapterFusion to combine knowledge sources, and trains with three objectives: standard MLM, MLM with full-word masking, and MLM with targeted masking, the last intended to connect concepts across languages. Empirical results show notable SA gains in several languages and more variable improvements for NER; the benefits depend on how much data is available and on whether a language was present in the base model's pretraining data. The work demonstrates both the potential and the limits of graph-based knowledge injection for low-resource languages, and the authors provide code for reproducibility and further exploration.
Abstract
This paper explores the integration of graph knowledge from linguistic ontologies into multilingual Large Language Models (LLMs) using adapters to improve performance for low-resource languages (LRLs) in sentiment analysis (SA) and named entity recognition (NER). Building upon successful parameter-efficient fine-tuning techniques, such as K-ADAPTER and MAD-X, we propose a similar approach for incorporating knowledge from multilingual graphs, which connect concepts across languages through linguistic relationships, into multilingual LLMs for LRLs. Specifically, we focus on eight LRLs (Maltese, Bulgarian, Indonesian, Nepali, Javanese, Uyghur, Tibetan, and Sinhala) and employ language-specific adapters fine-tuned on data extracted from the language-specific section of ConceptNet, aiming to enable knowledge transfer across the languages covered by the knowledge graph. We compare various fine-tuning objectives, including standard Masked Language Modeling (MLM), MLM with full-word masking, and MLM with targeted masking, to analyse their effectiveness in learning and integrating the extracted graph data. Through empirical evaluation on language-specific tasks, we assess how structured graph knowledge affects the performance of multilingual LLMs for LRLs in SA and NER, providing insights into the potential benefits of adapting language models for low-resource scenarios.
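To make the three fine-tuning objectives concrete, the following is a minimal sketch of how knowledge-graph triples might be turned into text and masked. It is not the authors' implementation: the relation templates, the fixed-size `toy_subwords` splitter (a stand-in for a real subword tokenizer), the English toy triple, and the reading of "targeted masking" as masking the tail concept of a triple are all assumptions made for illustration.

```python
import random

MASK = "[MASK]"

# Hypothetical relation templates; the actual verbalisation of ConceptNet
# triples in the paper may differ.
TEMPLATES = {
    "IsA": "{head} is a type of {tail}",
    "Synonym": "{head} is a synonym of {tail}",
    "RelatedTo": "{head} is related to {tail}",
}


def verbalize(head, relation, tail):
    """Turn one (head, relation, tail) triple into a plain sentence."""
    return TEMPLATES[relation].format(head=head, tail=tail)


def toy_subwords(word, size=3):
    """Stand-in for a real subword tokenizer: fixed-size character chunks."""
    return [word[i:i + size] for i in range(0, len(word), size)]


def standard_mlm(sentence, rate=0.15, rng=random):
    """Standard MLM: each subword piece is masked independently."""
    return [
        [MASK if rng.random() < rate else piece for piece in toy_subwords(word)]
        for word in sentence.split()
    ]


def full_word_mlm(sentence, rate=0.15, rng=random):
    """Full-word masking: all pieces of a selected word are masked together."""
    masked = []
    for word in sentence.split():
        pieces = toy_subwords(word)
        if rng.random() < rate:
            pieces = [MASK] * len(pieces)
        masked.append(pieces)
    return masked


def targeted_mlm(head, relation, tail):
    """Targeted masking (assumed reading): mask only the tail concept."""
    prefix, _, suffix = TEMPLATES[relation].partition("{tail}")
    before = [toy_subwords(w) for w in prefix.format(head=head).split()]
    hidden = [[MASK] * len(toy_subwords(w)) for w in tail.split()]
    after = [toy_subwords(w) for w in suffix.format(head=head).split()]
    return before + hidden + after


if __name__ == "__main__":
    sentence = verbalize("cat", "IsA", "animal")  # toy English triple
    rng = random.Random(0)
    print(sentence)
    print(standard_mlm(sentence, rng=rng))
    print(full_word_mlm(sentence, rng=rng))
    print(targeted_mlm("cat", "IsA", "animal"))
```

Under these assumptions, the difference between the objectives is what the model must reconstruct: arbitrary pieces, whole words, or specifically the concept that the graph relation points to. In a real setup the masked sequences would come from the base model's own tokenizer rather than `toy_subwords`.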
