Conditioning LLMs with Emotion in Neural Machine Translation
Charles Brazier, Jean-Luc Rouas
TL;DR
This work aims to improve machine translation (MT) quality by conditioning LLM-based translation on emotion cues extracted by a Speech Emotion Recognition (SER) model. It first fine-tunes five open-source 7B LLMs on Libri-trans to establish a strong baseline, identifying TowerBase-7B-v0.1 as particularly effective. The best-performing model is then retrained with emotion-augmented prompts that inject arousal, dominance, or valence, using three prompt templates. Evaluation with BLEU and COMET shows that arousal-driven prompts yield the most consistent gains, especially in COMET. The study demonstrates the potential of emotion-conditioned prompting to enhance MT and suggests future work on broader multilingual and speech-to-text translation tasks.
Abstract
Large Language Models (LLMs) have shown remarkable performance in Natural Language Processing tasks, including Machine Translation (MT). In this work, we propose a novel MT pipeline that integrates emotion information extracted by a Speech Emotion Recognition (SER) model into LLMs to enhance translation quality. We first fine-tune five existing LLMs on the Libri-trans dataset and select the best-performing model. We then augment the LLM prompts with different emotion dimensions and train the selected LLM under each of these configurations. Our experiments reveal that integrating emotion information, especially arousal, into LLM prompts leads to notable improvements in translation quality.
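To make the idea of an emotion-augmented prompt concrete, here is a minimal Python sketch. It is an illustration only and not the authors' implementation: the template wording, the `describe_level` binarization with a 0.5 threshold, the assumption that SER scores lie in [0, 1], and the English-to-French direction are all assumptions, and the function names are hypothetical.

```python
from typing import Optional

# Emotion dimensions named in the summary above.
EMOTION_DIMENSIONS = ("arousal", "dominance", "valence")


def describe_level(score: float, threshold: float = 0.5) -> str:
    """Map an SER score to a coarse label.

    Assumption: scores lie in [0, 1] and are split at a fixed threshold.
    The paper may use a different scale or pass the raw value instead.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    return "high" if score >= threshold else "low"


def build_prompt(
    source: str,
    dimension: Optional[str] = None,
    score: Optional[float] = None,
) -> str:
    """Build a translation prompt, optionally conditioned on one emotion dimension.

    The template text is hypothetical; it only illustrates where the
    emotion cue could be injected relative to the source sentence.
    """
    base = f"English: {source}\nFrench:"
    if dimension is None:
        # Baseline prompt without any emotion information.
        return base
    if dimension not in EMOTION_DIMENSIONS:
        raise ValueError(f"unknown emotion dimension: {dimension}")
    if score is None:
        raise ValueError("a score is required when a dimension is given")
    level = describe_level(score)
    header = f"Translate the following English text with {level} {dimension} into French."
    return f"{header}\n{base}"


if __name__ == "__main__":
    print(build_prompt("I cannot believe you did that!"))
    print(build_prompt("I cannot believe you did that!", "arousal", 0.83))
```

Under these assumptions, the baseline and the emotion-conditioned configurations differ only in the prompt header, so the same fine-tuning loop can be reused for each dimension by swapping the `dimension` argument.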
