Multilingual Sentence-T5: Scalable Sentence Encoders for Multilingual Applications
Chihiro Yano, Akihiko Fukuchi, Shoko Fukasawa, Hideyuki Tachibana, Yotaro Watanabe
TL;DR
This work scales multilingual sentence embedding by extending Sentence T5 to m-ST5, a 5.7B-parameter encoder derived from mT5 and trained with NLI-based contrastive learning, with LoRA keeping fine-tuning tractable at that size. It demonstrates strong cross-lingual retrieval and cross-lingual STS performance, outperforming prior NLI-based methods such as mSimCSE, with notable gains for low-resource and linguistically distant languages. The study finds a positive correlation between model size and performance: larger models yield better multilingual alignment, especially when cross-lingual data is leveraged, and monolingual transfer can become more effective at scale. The model is competitive across the Tatoeba, BUCC, and XSTS benchmarks, and the trained m-ST5 model has been released publicly.
Abstract
Prior work on multilingual sentence embedding has demonstrated that the efficient use of natural language inference (NLI) data to build high-performance models can outperform conventional methods. However, the potential benefits of the recent "exponential" growth of language models with billions of parameters have not yet been fully explored. In this paper, we introduce Multilingual Sentence T5 (m-ST5), a larger NLI-based multilingual sentence embedding model, by extending Sentence T5, an existing monolingual model. By employing the low-rank adaptation (LoRA) technique, we successfully scaled the model to 5.7 billion parameters. We conducted experiments to evaluate sentence embedding performance and verified that the method outperforms the prior NLI-based approach. Furthermore, we confirmed a positive correlation between model size and performance. Notably, languages with fewer resources or with less linguistic similarity to English benefited more from the parameter increase. Our model is available at https://huggingface.co/pkshatech/m-ST5.
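
The text here does not spell out the training objective. As a rough illustration of what NLI-based contrastive learning usually looks like, the sketch below computes a SimCSE-style loss in which each premise is pulled toward its entailment hypothesis and pushed away from its contradiction hypothesis and from the other sentences in the batch. The function names, the cosine similarity, and the temperature of 0.05 are assumptions made for illustration, not details taken from the paper, and toy 2-dimensional vectors stand in for encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nli_contrastive_loss(premises, entailments, contradictions, temperature=0.05):
    """Mean cross-entropy over a batch (hypothetical helper, not the authors' code).

    For premise i, the positive is entailments[i]. All other entailments
    (in-batch negatives) and all contradictions (hard negatives) compete
    with it in the softmax.
    """
    candidates = entailments + contradictions
    total = 0.0
    for i, premise in enumerate(premises):
        logits = [cosine(premise, c) / temperature for c in candidates]
        # Log-sum-exp with the max subtracted for numerical stability.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_z - logits[i]  # equals -log softmax(logits)[i]
    return total / len(premises)


# Toy embeddings: entailments lie close to their premises, contradictions oppose them.
premises = [[1.0, 0.0], [0.0, 1.0]]
entailments = [[0.9, 0.1], [0.1, 0.9]]
contradictions = [[-1.0, 0.0], [0.0, -1.0]]

well_aligned = nli_contrastive_loss(premises, entailments, contradictions)
misaligned = nli_contrastive_loss(premises, contradictions, entailments)
print(well_aligned < misaligned)
```

With embeddings that already separate entailment from contradiction, the loss is close to zero. Swapping the two roles makes it large, and that gap is the signal the encoder is trained to reduce.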
