TRepLiNa: Layer-wise CKA+REPINA Alignment Improves Low-Resource Machine Translation in Aya-23 8B
Toshiki Nakai, Ravi Kiran Chikkala, Lena Sophie Oberkircher, Nicholas Jennings, Natalia Skachkova, Tatiana Anikina, Jesujoba Oluwadara Alabi
TL;DR
The paper tackles translation from low-resource languages (LRLs) into high-resource languages (HRLs) when little training data is available. It introduces TRepLiNa, a layer-wise alignment method that combines Centered Kernel Alignment (CKA) with REPINA regularization to align mid-layer representations in a decoder-only LLM (Aya-23 8B) and improve LRL→HRL translation in zero-shot, few-shot, and small-data fine-tuning settings. Across Mundari, Santali, Bhili, and Gondi with Hindi/English pivots, TRepLiNa shows consistent gains when applied at mid-level layers (around 10–15), often outperforming CKA alone or REPINA alone, and yields competitive results on several targets in the MMLoSo benchmark. The work offers practical guidance on when and where to apply layer-wise alignment in low-resource MT, and it notes limitations and directions for extending the approach to other model families and data regimes.
Abstract
The 2025 Multimodal Models for Low-Resource Contexts and Social Impact (MMLoSo) Language Challenge addresses one of India's most pressing linguistic gaps: the lack of resources for its diverse low-resource languages (LRLs). In this study, we investigate whether enforcing cross-lingual similarity in specific internal layers of a decoder-only multilingual large language model (LLM) can improve translation quality from LRLs into a high-resource language (HRL). Specifically, we combine Centered Kernel Alignment (CKA), a similarity metric that encourages representations of different languages to align, with REPINA, a regularization method that constrains parameter updates to remain close to the pretrained model, into a joint method we call TRepLiNa. We experiment with zero-shot, few-shot, and fine-tuning settings using Aya-23 8B with QLoRA across MMLoSo shared task language pairs (Mundari, Santali, Bhili) with Hindi/English pivots. Our results show that aligning mid-level layers using TRepLiNa (CKA+REPINA) is a low-cost, practical approach to improving LRL translation, especially in data-scarce settings.
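To make the combination of CKA and REPINA more concrete, here is a minimal NumPy sketch of how such a joint objective could be assembled. It is an illustration under stated assumptions, not the authors' implementation. The function names, the weights `lam` and `mu`, the use of linear CKA, the `1 - CKA` form of the alignment term, and the squared-distance form of the REPINA term (hidden states kept close to those of the frozen pretrained model) are all choices made for this example. The inputs are assumed to be pooled hidden states of shape `(n_sentences, hidden_dim)` taken from one chosen layer.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two (n_samples, dim) activation matrices."""
    # Center each feature over the batch, as CKA requires.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    # Undefined (division by zero) if either input is constant over the batch.
    return cross / (norm_x * norm_y)


def repina_penalty(h_tuned, h_pretrained):
    """Assumed REPINA-style term: mean squared drift from pretrained states."""
    return np.mean(np.sum((h_tuned - h_pretrained) ** 2, axis=-1))


def joint_loss(task_loss, h_lrl, h_hrl, h_lrl_pretrained, lam=0.1, mu=0.1):
    """Hypothetical joint objective: task loss + alignment + stability."""
    align = 1.0 - linear_cka(h_lrl, h_hrl)  # 0 when the two sides align perfectly
    drift = repina_penalty(h_lrl, h_lrl_pretrained)
    return task_loss + lam * align + mu * drift


# Toy usage with random stand-ins for pooled layer activations.
rng = np.random.default_rng(0)
h_lrl = rng.normal(size=(16, 8))              # LRL sentences, tuned model
h_hrl = rng.normal(size=(16, 8))              # parallel HRL sentences
h_pre = h_lrl + 0.01 * rng.normal(size=(16, 8))  # same LRL inputs, frozen model
loss = joint_loss(2.5, h_lrl, h_hrl, h_pre)
```

In an actual fine-tuning run these terms would be computed on differentiable tensors inside the training loop. The NumPy version only shows the arithmetic: the alignment term pulls LRL and HRL representations together at the selected layer, and the stability term limits how far that layer drifts from the pretrained model.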
