Using Language Models to Disambiguate Lexical Choices in Translation

Josh Barua, Sanjay Subramanian, Kayo Yin, Alane Suhr

TL;DR

Working with native speakers of nine languages, the authors create DTAiLS, a dataset of 1,377 sentence pairs that exhibit cross-lingual concept variation when translating from English, and use language models to generate English rules describing target-language concept variations.

Abstract

In translation, a concept represented by a single word in a source language can have multiple variations in a target language. The task of lexical selection requires using context to identify which variation is most appropriate for a source text. We work with native speakers of nine languages to create DTAiLS, a dataset of 1,377 sentence pairs that exhibit cross-lingual concept variation when translating from English. We evaluate recent LLMs and neural machine translation systems on DTAiLS; the best-performing model, GPT-4, achieves between 67% and 85% accuracy depending on the language. Finally, we use language models to generate English rules describing target-language concept variations. Providing weaker models with high-quality lexical rules improves accuracy substantially, in some cases reaching or outperforming GPT-4.
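
As a rough illustration of how lexical selection accuracy might be scored, the sketch below compares predicted variations against gold labels and summarizes three runs as a mean and standard deviation, in the spirit of the $\mu_{\pm \sigma}$ reporting in Figure 2. The function name `selection_accuracy`, the toy labels, and the use of the population standard deviation are all assumptions made for illustration; only the Farsi variation names (khorma, rotab, kharak) come from Figure 1, and none of the numbers are results from the paper.

```python
from statistics import mean, pstdev


def selection_accuracy(predictions, gold):
    """Return the fraction of sentences whose predicted variation matches gold."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("need at least one example")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical gold labels for four English sentences containing "date".
gold = ["khorma", "rotab", "kharak", "khorma"]

# Hypothetical model outputs from three independent runs.
runs = [
    ["khorma", "rotab", "khorma", "khorma"],  # 3 of 4 correct
    ["khorma", "rotab", "kharak", "khorma"],  # 4 of 4 correct
    ["rotab", "rotab", "kharak", "khorma"],   # 3 of 4 correct
]

accuracies = [selection_accuracy(run, gold) for run in runs]
mu = mean(accuracies)
# Population standard deviation is an assumption; the source does not say
# whether the sample or population form is used.
sigma = pstdev(accuracies)

print(f"accuracy = {mu:.3f} +/- {sigma:.3f}")
```

Exact string match is the simplest possible scoring rule; a real evaluation would likely need to normalize model output (whitespace, script, transliteration) before comparing.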

Paper Structure

This paper contains 23 sections, 2 equations, 9 figures, 7 tables.

Figures (9)

  • Figure 1: Generated rules for English date with lexical variations khorma, rotab, and kharak in Farsi.
  • Figure 2: Comparisons between LMs with and without rules to NMT systems on lexical selection. We report $\mu_{\pm \sigma}$ across 3 runs for LM experiments.
  • Figure 3: Interface for annotating rules and extracted concepts and variations.
  • Figure 4: Interface for lexical selection task.
  • Figure 5: Percent of time each model selects an answer at each position when there are 2 lexical variations.
  • ...and 4 more figures