Domain Adaptation of Multilingual Semantic Search -- Literature Review

Anna Bringmann, Anastasia Zhukova

TL;DR

This paper examines how to perform domain adaptation for multilingual semantic search in low-resource settings. It introduces a systematic, component-based typology that clusters domain-adaptation approaches by the IR component they modify and surveys data-, model-, training-, and ranking-level strategies, as well as their multilingual applicability. Key themes include query generation, contrastive learning, knowledge distillation, cross-encoder vs bi-encoder tradeoffs, and lexical-aware ranking techniques, with attention to practical costs. The work also surveys multilingual dense retrieval approaches and cross-lingual transfer, highlighting how domain adaptation techniques could extend to multilingual contexts to improve cross-language information access in low-resource domains.

Abstract

This literature review gives an overview of current approaches to domain adaptation in a low-resource setting and of approaches to multilingual semantic search in a low-resource setting. We developed a new typology that clusters domain adaptation approaches by the part of a dense textual information retrieval system that they adapt, focusing on how to combine them efficiently. We also explore the possibilities of combining multilingual semantic search with domain adaptation approaches for dense retrievers in a low-resource setting.

Paper Structure (20 sections, 16 equations, 2 figures, 1 table)