Refining Translations with LLMs: A Constraint-Aware Iterative Prompting Approach

Shangfeng Chen, Xiayang Shi, Pu Li, Yinlin Li, Jingjing Liu

TL;DR

This work proposes a multi-step prompt chain that improves translation faithfulness by prioritizing the key terms most important for semantic accuracy, particularly in low-resource or domain-specific contexts.

Abstract

Large language models (LLMs) have demonstrated remarkable proficiency in machine translation (MT), even without specific training on the languages in question. However, translating rare words in low-resource or domain-specific contexts remains challenging for LLMs. To address this issue, we propose a multi-step prompt chain that enhances translation faithfulness by prioritizing key terms crucial for semantic accuracy. Our method first identifies these keywords and retrieves their translations from a bilingual dictionary, integrating them into the LLM's context using Retrieval-Augmented Generation (RAG). We further mitigate potential output hallucinations caused by long prompts through an iterative self-checking mechanism, where the LLM refines its translations based on lexical and semantic constraints. Experiments using Llama and Qwen as base models on the FLORES-200 and WMT datasets demonstrate significant improvements over baselines, highlighting the effectiveness of our approach in enhancing translation faithfulness and robustness, particularly in low-resource scenarios.
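
The pipeline described in the abstract (dictionary lookup for keywords, a notes-augmented prompt, then iterative self-checking) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, prompt wording, and `max_rounds` parameter are hypothetical, `llm` stands for any caller-supplied text-in/text-out function, keywords are passed in rather than extracted by a prompt, and a plain substring test stands in for the lexical check, with the semantic check omitted.

```python
from typing import Callable, Dict, List


def build_translation_notes(keywords: List[str], dictionary: Dict[str, str]) -> Dict[str, str]:
    """Look up each keyword in the bilingual dictionary; skip missing entries."""
    return {kw: dictionary[kw] for kw in keywords if kw in dictionary}


def missing_terms(translation: str, notes: Dict[str, str]) -> List[str]:
    """Stand-in lexical check: dictionary translations absent from the output."""
    lowered = translation.lower()
    return [target for target in notes.values() if target.lower() not in lowered]


def refine_translation(
    source: str,
    keywords: List[str],
    dictionary: Dict[str, str],
    llm: Callable[[str], str],  # hypothetical: any function mapping a prompt to text
    max_rounds: int = 3,        # hypothetical cap on refinement rounds
) -> str:
    """Translate with dictionary notes, then re-prompt while terms are missing."""
    notes = build_translation_notes(keywords, dictionary)
    note_text = "; ".join(f"{src} -> {tgt}" for src, tgt in notes.items())
    prompt = f"Translate: {source}\nTranslation notes: {note_text}"
    translation = llm(prompt)

    for _ in range(max_rounds):
        missing = missing_terms(translation, notes)
        if not missing:
            break  # every dictionary term appears in the output
        # Feed the previous output back with the violated constraints.
        translation = llm(
            f"{prompt}\nPrevious translation: {translation}\n"
            f"Revise it so that it uses these terms: {', '.join(missing)}"
        )
    return translation
```

Capping the number of rounds keeps the loop from running indefinitely when the model never satisfies a constraint; in that case the last attempt is returned unchanged.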

Paper Structure

This paper contains 17 sections, 8 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: The translation process of the proposed method.
  • Figure 2: The prompt template for keyword extraction.
  • Figure 3: The prompt template for translation based on translation notes.
  • Figure 4: The prompt template for translation based on translation notes.
  • Figure 5: Comparison of BLEU scores across different word selection methods.
  • ...and 2 more figures