Enhancing Entertainment Translation for Indian Languages using Adaptive Context, Style and LLMs

Pratik Rakesh Singh; Mohammadi Zaki; Pankaj Wasnik

TL;DR

The paper addresses the entertainment translation problem, where accurate, engaging translations must leverage running context and stylistic cues. It proposes CASAT, a context- and style-aware translation framework that combines a Context Retrieval-Advanced RAG module with a Domain Adaptation Module to produce a time-varying prompt $p_t$ for LLM-based translation, and it segments content into adaptive sessions to capture mood and genre. The approach is language- and LLM-agnostic and relies on offline context extraction from plot summaries stored in a vector database, enabling in-context learning for high-quality translations. Experimental results on Indian-language directions (En–Hi, En–Ben, En–Tel) show consistent improvements in COMET scores and win-ratios across multiple LLMs, outperforming traditional MT baselines in entertainment-specific translation tasks. The work demonstrates practical impact for dubbing and subtitling in diverse languages and offers a path toward online, context-sensitive, culturally aware translation systems.
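The flow described above (retrieve context from stored plot-summary chunks, combine it with a session style estimate, and emit a prompt $p_t$) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the bag-of-words similarity stands in for a real embedding model and vector database, and the names `retrieve_context`, `build_prompt`, `plot_chunks`, and `session_style`, along with the prompt wording, are hypothetical.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector; a toy stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_context(query, chunks, k=1):
    """Return the k plot-summary chunks most similar to the query line."""
    q = bow(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, bow(c)), reverse=True)
    return ranked[:k]


def build_prompt(dialogue, context, style, target_lang):
    """Assemble a prompt p_t from retrieved context and the session style (assumed format)."""
    return (
        f"Translate the dialogue into {target_lang}.\n"
        f"Context: {' '.join(context)}\n"
        f"Style: mood={style['mood']}, genre={style['genre']}\n"
        f"Dialogue: {dialogue}"
    )


# Hypothetical plot-summary chunks, extracted offline in the real system.
plot_chunks = [
    "The detective finds a hidden letter in the old library.",
    "Two friends argue loudly at a wedding about a lost ring.",
]
# Hypothetical style estimate for the current session.
session_style = {"mood": "tense", "genre": "mystery"}

line = "Where did you hide the letter?"
context = retrieve_context(line, plot_chunks, k=1)
prompt = build_prompt(line, context, session_style, "Hindi")
print(prompt)
```

Because the style estimate is attached to a session rather than to a single sentence, the prompt changes only when the retrieved context or the session's mood and genre change, which is what makes $p_t$ time-varying.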

Abstract

We address the challenging task of neural machine translation (NMT) in the entertainment domain, where the objective is to automatically translate dialogue from source-language content into a target language. This task has various applications, particularly in automatic dubbing, subtitling, and other content localization tasks, enabling source content to reach a wider audience. Traditional NMT systems typically translate individual sentences in isolation, without carrying over crucial information such as the context and style of previously encountered sentences. In this work, we emphasize the importance of these aspects in producing pertinent and captivating translations. We illustrate their importance through several examples and propose a novel framework for entertainment translation, which, to our knowledge, is the first of its kind. Furthermore, we introduce an algorithm that estimates the context and style of the current session and uses these estimates to generate a prompt that guides a Large Language Model (LLM) toward high-quality translations. Our method is both language- and LLM-agnostic, making it a general-purpose tool. We demonstrate the effectiveness of our algorithm through various numerical studies and observe significant improvements in COMET scores over various state-of-the-art LLMs. Moreover, our proposed method consistently outperforms baseline LLMs in terms of win-ratio.
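The abstract does not define win-ratio. One plausible reading, assumed here, is the fraction of test sentences on which the proposed method's per-sentence score strictly exceeds the baseline's. The sketch below computes it under that assumption; the function name `win_ratio` is hypothetical, and the scores are made-up illustrative values, not results from the paper.

```python
def win_ratio(method_scores, baseline_scores):
    """Fraction of sentences where the method strictly beats the baseline (assumed definition)."""
    if len(method_scores) != len(baseline_scores):
        raise ValueError("score lists must have the same length")
    if not method_scores:
        raise ValueError("score lists must not be empty")
    wins = sum(m > b for m, b in zip(method_scores, baseline_scores))
    return wins / len(method_scores)


# Made-up per-sentence quality scores, for illustration only.
method = [0.82, 0.75, 0.90, 0.66]
baseline = [0.80, 0.78, 0.85, 0.66]

ratio = win_ratio(method, baseline)
print(ratio)
```

Ties count as non-wins in this sketch; a different tie-handling rule would change the number, so the convention should be checked against the paper's experimental section.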

Paper Structure (19 sections, 5 equations, 8 figures, 7 tables, 2 algorithms)

Introduction
Background and Motivation
Methodology
Problem Formulation
Adaptive Session Classification and Segmentation
Session Information Generation
Context retrieval: Advanced RAG
Style extraction: Domain Adaptation Module
Experiments
Experimental Settings
Can CASAT provide audience-engaging translations?
Ablation Studies
Conclusion
Appendix
Additional Ablation Experiments
...and 4 more sections

Figures (8)

Figure 1: Examples of common mistakes made by NMT systems while translating entertainment domain text.
Figure 2: A high-level overview of our proposed methodology.
Figure 3: A block diagram of the Context retriever block.
Figure 4: Domain Adaptation Module
Figure 5: A comparative analysis of the effect of various prompts on the translated text.
...and 3 more figures
