BabelCoder: Agentic Code Translation with Specification Alignment

Fazle Rabbi; Soumit Kanti Saha; Tri Minh Triet Pham; Song Wang; Jinqiu Yang

TL;DR

BabelCoder is a multi-agent framework for cross-language code translation that splits the task among specialized Translation, Test, and Refinement agents. It uses natural-language specifications (NL-Specifications) to guide translation semantically, combines spectrum-based fault localization (SBFL) with LLM-based scope estimation for targeted bug localization, and applies test-driven refinement to improve executability and robustness. Evaluated on Avatar, CodeNet, EvalPlus, and TransCoder across five languages, it achieves an average computational accuracy of 94.16% and outperforms four strong baselines in most cases. Ablation studies show that NL-Specification augmentation/validation and bug-scope estimation each add value, underscoring the effectiveness of integrating specification-guided reasoning with iterative repair. The work suggests promising directions for scaling to repository-level translation and for further improving semantic fidelity through richer control-flow information.
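To make the fault-localization idea concrete, the following is a minimal sketch of spectrum-based fault localization using the Ochiai formula, a commonly used SBFL metric. The summary above does not say which SBFL formula BabelCoder uses, so the choice of Ochiai, the function name `ochiai_scores`, and the toy coverage data are all assumptions made for illustration only.

```python
import math
from typing import Dict, List, Set


def ochiai_scores(coverage: List[Set[int]], passed: List[bool]) -> Dict[int, float]:
    """Rank lines by suspiciousness (illustrative helper, not from the paper).

    coverage[i] is the set of line numbers executed by test i,
    passed[i] is True if test i passed.
    """
    total_failed = sum(1 for ok in passed if not ok)
    all_lines = set().union(*coverage)
    scores: Dict[int, float] = {}
    for line in all_lines:
        # ef: failing tests that execute the line; ep: passing tests that execute it.
        ef = sum(1 for cov, ok in zip(coverage, passed) if line in cov and not ok)
        ep = sum(1 for cov, ok in zip(coverage, passed) if line in cov and ok)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[line] = ef / denom if denom else 0.0
    return scores


# Toy spectrum (assumed data): three tests, the second one fails.
coverage = [{1, 2, 3}, {1, 2, 4}, {1, 3}]
passed = [True, False, True]
scores = ochiai_scores(coverage, passed)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy spectrum, line 4 is executed only by the failing test, so it ranks as most suspicious. A ranking of this kind could then be narrowed further, for example by an LLM-estimated bug scope as the summary describes.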

Abstract

As software systems evolve, developers increasingly work across multiple programming languages and often face the need to migrate code from one language to another. While automatic code translation offers a promising solution, it has long remained a challenging task. Recent advancements in Large Language Models (LLMs) have shown potential for this task, yet existing approaches remain limited in accuracy and fail to effectively leverage contextual and structural cues within the code. Prior work has explored translation and repair mechanisms, but lacks a structured, agentic framework where multiple specialized agents collaboratively improve translation quality. In this work, we introduce BabelCoder, an agentic framework that performs code translation by decomposing the task into specialized agents for translation, testing, and refinement, each responsible for a specific aspect such as generating code, validating correctness, or repairing errors. We evaluate BabelCoder on four benchmark datasets and compare it against four state-of-the-art baselines. BabelCoder outperforms existing methods by 0.5%-13.5% in 94% of cases, achieving an average accuracy of 94.16%.
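The division of labor among translation, testing, and refinement can be sketched as a simple loop. This is only an illustrative skeleton under stated assumptions: the function `translate_with_refinement`, its callable parameters, the `max_rounds` budget, and the toy stand-in agents are hypothetical and are not taken from the paper, where each role would be played by an LLM-backed agent.

```python
from typing import Callable, List, Tuple


def translate_with_refinement(
    source: str,
    translate: Callable[[str], str],
    run_tests: Callable[[str], List[str]],
    refine: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Translate once, then alternate testing and repair (hypothetical sketch)."""
    candidate = translate(source)
    for _ in range(max_rounds):
        failures = run_tests(candidate)
        if not failures:
            return candidate, True  # all tests pass
        candidate = refine(candidate, failures)
    # Budget exhausted: report whether the last repair happens to pass.
    return candidate, not run_tests(candidate)


# Toy stand-ins for the three agents (assumptions for demonstration only).
def toy_translate(src: str) -> str:
    return "lambda a, b: a - b"  # deliberately buggy first translation


def toy_run_tests(code: str) -> List[str]:
    fn = eval(code)  # acceptable only because the input is a fixed toy string
    return [] if fn(2, 3) == 5 else ["add(2, 3): expected 5"]


def toy_refine(code: str, failures: List[str]) -> str:
    return code.replace("-", "+")


result, ok = translate_with_refinement(
    "def add(a, b): return a + b", toy_translate, toy_run_tests, toy_refine
)
```

Here the first candidate fails its test, the refinement step repairs it, and the second test run passes, so the loop stops early.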
