Agent-SiMT: Agent-assisted Simultaneous Machine Translation with Large Language Models

Shoutao Guo, Shaolei Zhang, Zhengrui Ma, Min Zhang, Yang Feng

TL;DR

This work tackles the latency–quality tradeoff in simultaneous machine translation (SiMT) by decoupling policy decision from translation generation. It introduces Agent-SiMT, a dual-agent framework in which a policy-decision agent (a Transformer-based SiMT model) collaborates with a translation agent (an LLM) through a memory that holds the source input and the translation produced so far. A boundary-constrained word-level policy bridges the SiMT model’s token-level decisions and the LLM’s vocabulary, and supervised fine-tuning with LoRA on a small full-sentence parallel corpus improves the translation agent. Experiments on WMT15 De→En, MuST-C En→De, and a Zh→En task report state-of-the-art performance and good generalization, with inference latency described as acceptable on modern hardware. The results suggest that splitting responsibilities between specialized agents improves translation quality and reduces hallucinations, offering a practical path toward deploying high-quality SiMT systems.

Abstract

Simultaneous Machine Translation (SiMT) generates target translations while reading the source sentence. It relies on a policy to determine the optimal timing for reading sentences and generating translations. Existing SiMT methods generally adopt the traditional Transformer architecture, which concurrently determines the policy and generates translations. While they excel at determining policies, their translation performance is suboptimal. Conversely, Large Language Models (LLMs), trained on extensive corpora, possess superior generation capabilities, but it is difficult for them to acquire a translation policy through the training methods of SiMT. Therefore, we introduce Agent-SiMT, a framework combining the strengths of LLMs and traditional SiMT methods. Agent-SiMT contains the policy-decision agent and the translation agent. The policy-decision agent is managed by a SiMT model, which determines the translation policy using the partial source sentence and translation. The translation agent, leveraging an LLM, generates the translation based on the partial source sentence. The two agents collaborate to accomplish SiMT. Experiments demonstrate that Agent-SiMT attains state-of-the-art performance.
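
The collaboration between the two agents can be pictured as a READ/WRITE loop over a shared memory. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `wait_k_policy`, `toy_translator`, `simultaneous_translate`, and the `EOS` marker are hypothetical, a simple wait-k rule stands in for the Transformer-based policy-decision agent, and a toy word-copying function stands in for the LLM.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper's code):
# a policy maps (source words read, target words written) to "READ" or "WRITE";
# a translator maps (source prefix, target prefix) to the next target word.
Policy = Callable[[int, int], str]
Translator = Callable[[List[str], List[str]], str]

EOS = "</s>"  # assumed end-of-sentence marker


def wait_k_policy(k: int) -> Policy:
    """Wait-k rule used here as a stand-in for the policy-decision agent."""
    def decide(num_read: int, num_written: int) -> str:
        # Stay k source words ahead of the translation.
        return "READ" if num_read - num_written < k else "WRITE"
    return decide


def toy_translator(src_prefix: List[str], tgt_prefix: List[str]) -> str:
    """Stand-in for the LLM translation agent: upper-cases the next source word."""
    if len(tgt_prefix) < len(src_prefix):
        return src_prefix[len(tgt_prefix)].upper()
    return EOS


def simultaneous_translate(
    source: List[str],
    policy: Policy,
    translator: Translator,
    max_len: int = 50,
) -> Tuple[List[str], List[str]]:
    memory_src: List[str] = []  # source words read so far
    memory_tgt: List[str] = []  # translation generated so far
    actions: List[str] = []
    while len(memory_tgt) < max_len:
        action = policy(len(memory_src), len(memory_tgt))
        if len(memory_src) == len(source):
            action = "WRITE"  # nothing left to read, so only generation remains
        actions.append(action)
        if action == "READ":
            memory_src.append(source[len(memory_src)])
        else:
            word = translator(memory_src, memory_tgt)
            if word == EOS:
                break
            memory_tgt.append(word)
    return memory_tgt, actions
```

Calling `simultaneous_translate(["a", "b", "c", "d"], wait_k_policy(2), toy_translator)` reads two words before the first WRITE and then alternates between reading and writing until the source is exhausted. The point of the sketch is the separation of concerns: the policy never produces words, and the translator never decides when to speak.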

Paper Structure (20 sections, 7 equations, 8 figures, 7 tables)

Figures (8)

  • Figure 1: The framework of Agent-SiMT. The numbers in the diagram indicate the execution order of Agent-SiMT. The red lines denote operations performed when the policy-decision agent decides to read source words. The blue lines indicate operations carried out when it decides to generate. The black line denotes the operation shared by both decision types.
  • Figure 2: Illustration of incorporating boundary restrictions into the word-level policy. The hyperparameters $B$ and $T$ in the figure are set to 1 and 3, respectively. Without boundary restrictions, the word-level policy generates $y_1$ after reading $x_4$; with them, $y_1$ is generated upon reading $x_3$.
  • Figure 3: Performance of different SiMT methods on the De$\rightarrow$En and En$\rightarrow$De tasks.
  • Figure 4: Performance of our method on the Zh$\rightarrow$En task when using different open-source translation LLMs as the translation agent.
  • Figure 5: The impact of the amount of SFT training data on Agent-Wait-$k$+SFT. The experiments are on the De$\rightarrow$En task.
  • ...and 3 more figures
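
The Figure 2 caption can be turned into a small worked example. This is a sketch under an assumption the caption does not spell out: $B$ and $T$ are taken to be lower and upper bounds on how many more source words than target words have been read at the moment a target word is generated. The function name `apply_boundary_restrictions` and the list representation of the policy are hypothetical; the interpretation was chosen because it reproduces the caption's example ($y_1$ moved from $x_4$ to $x_3$ when $T = 3$).

```python
from typing import List


def apply_boundary_restrictions(
    policy: List[int], lower: int, upper: int, src_len: int
) -> List[int]:
    """Clip a word-level policy to an assumed [lower, upper] look-ahead window.

    policy[i] is the number of source words read before generating the
    (i+1)-th target word, i.e. when i target words already exist.
    """
    constrained = []
    for i, num_read in enumerate(policy):
        # Assumed bounds: read at least i + lower and at most i + upper
        # source words, never more than the sentence contains.
        lo = min(i + lower, src_len)
        hi = min(i + upper, src_len)
        constrained.append(max(lo, min(num_read, hi)))
    return constrained


# Caption setting: B = 1, T = 3, and the unrestricted policy waits for x_4
# before generating y_1. The remaining entries are made up for illustration.
restricted = apply_boundary_restrictions([4, 4, 5, 6], lower=1, upper=3, src_len=6)
print(restricted)  # [3, 4, 5, 6]
```

Under this reading, the upper bound keeps the policy-decision agent from delaying generation too long, and the lower bound keeps it from generating before enough source context is available.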