LaTeXTrans: Structured LaTeX Translation with Multi-Agent Coordination
Ziming Zhu, Chenglong Wang, Shunjie Xing, Yifu Huo, Fengning Tian, Quan Du, Di Yang, Chunliang Zhang, Tong Xiao, Jingbo Zhu
TL;DR
LaTeXTrans addresses the challenge of translating LaTeX-formatted documents by translating at the source level with a collaborative multi-agent pipeline that preserves structure and semantics. It deploys three modules (Parser, Translation, Generator) and six specialized agents to decompose, translate, validate, summarize, and maintain terminology, then reconstructs the target-language LaTeX and compiles it to PDF. Experimental results on arXiv-derived LaTeX content show LaTeXTrans outperforms traditional MT and single-agent baselines in both translation quality and format fidelity, with notable gains in FC-score and COMETKiwi, especially when using GPT-4o as the backbone. The work provides a practical, open-source solution for high-fidelity LaTeX document translation and highlights the benefits of structured, context-aware, and terminology-consistent translation in formatted texts.
Abstract
Despite the remarkable progress of modern machine translation (MT) systems on general-domain texts, translating structured LaTeX-formatted documents remains a significant challenge. These documents typically interleave natural language with domain-specific syntax, such as mathematical equations, tables, figures, and cross-references, all of which must be accurately preserved to maintain semantic integrity and compilability. In this paper, we introduce LaTeXTrans, a collaborative multi-agent system designed to address this challenge. LaTeXTrans ensures format preservation, structural fidelity, and terminology consistency through six specialized agents: 1) a Parser that decomposes LaTeX into translation-friendly units via placeholder substitution and syntax filtering; 2) a Translator, Validator, Summarizer, and Terminology Extractor that work collaboratively to ensure context-aware, self-correcting, and terminology-consistent translations; 3) a Generator that reconstructs the translated content into well-structured LaTeX documents. Experimental results demonstrate that LaTeXTrans can outperform mainstream MT systems in both translation accuracy and structural fidelity, offering an effective and practical solution for translating LaTeX-formatted documents. The code for LaTeXTrans is available at https://github.com/NiuTrans/LaTeXTrans.
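
As a rough illustration of the placeholder substitution idea mentioned in the abstract, the sketch below masks inline math and a few reference commands before translation and restores them afterwards. It is not taken from the LaTeXTrans code base: the `PROTECTED` pattern, the `<PH_n>` token format, and the `mask` and `unmask` helpers are all assumptions made for this example, and the real Parser likely handles far more syntax.

```python
import re

# Assumed for illustration: protect inline math and \cite/\ref/\label
# commands that take a single braced argument. Not the LaTeXTrans rules.
PROTECTED = re.compile(r"\$[^$]+\$|\\(?:cite|ref|label)\{[^}]*\}")


def mask(text):
    """Replace protected LaTeX spans with numbered placeholders."""
    spans = []

    def repl(match):
        spans.append(match.group(0))
        return f"<PH_{len(spans) - 1}>"

    return PROTECTED.sub(repl, text), spans


def unmask(text, spans):
    """Restore the original LaTeX spans; fail if a placeholder was lost."""
    for i, span in enumerate(spans):
        token = f"<PH_{i}>"
        if token not in text:
            raise ValueError(f"placeholder {token} missing after translation")
        text = text.replace(token, span)
    return text


source = r"See Eq. \ref{eq:1} where $x^2$ holds \cite{a}."
masked, spans = mask(source)
# A translation step (hypothetical here) would operate on `masked`.
restored = unmask(masked, spans)
```

The check inside `unmask` hints at why a validation step is useful: if a translation model drops or alters a placeholder, the error surfaces before the document is reassembled and compiled.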
