Advancing Automated In-Isolation Validation in Repository-Level Code Translation
Kaiyao Ke, Ali Reza Ibrahimzada, Rangeet Pan, Saurabh Sinha, Reyhaneh Jabbarvand
TL;DR
TRAM addresses the challenge of validating repository-level code translations by integrating context-aware, RAG-guided type resolution with a mock-based in-isolation validation pipeline. It decomposes programs into fragments, resolves cross-language types using API documentation and usage context, and translates the fragments while generating isolated Python mocks to verify functional equivalence. Empirical results on ten Java projects show that TRAM achieves higher functional equivalence and broader validatable coverage than AlphaTrans and GraalVM-based approaches, with less manual intervention. The approach is LLM-agnostic and potentially applicable to additional language pairs, offering a scalable, automated path toward reliable repository-level translation.
Abstract
Repository-level code translation aims to automatically migrate entire repositories across programming languages while preserving functionality. Despite recent advances, validating the translations remains challenging. This paper proposes TRAM, which combines context-aware type resolution with mock-based in-isolation validation to achieve high-quality translations between programming languages. Prior to translation, TRAM retrieves API documentation and contextual code information for each variable type in the source language. It then prompts a large language model (LLM) with the retrieved information to resolve type mappings across languages with precise semantic interpretations. Using the automatically constructed type mapping, TRAM employs a custom serialization/deserialization workflow that constructs equivalent mock objects in the target language. This enables each method fragment to be validated in isolation, without the high cost of agent-based translation validation or the heavy manual effort required by existing approaches that rely on language interoperability. TRAM demonstrates state-of-the-art performance in Java-to-Python translation, underscoring the effectiveness of integrating retrieval-augmented generation (RAG)-based type resolution with reliable in-isolation validation.
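To make the validation idea concrete, the following is a minimal sketch of mock-based in-isolation validation. It is not TRAM's implementation: the JSON record format, the contents of TYPE_MAP, and the names deserialize, validate_in_isolation, and total_length are all hypothetical, and the hand-written type map stands in for the LLM-resolved mapping described above. The sketch assumes that the inputs and output of a Java method were recorded as type-tagged JSON, rebuilds equivalent Python objects from that record, and checks the translated fragment against the recorded result.

```python
import json

# Hypothetical Java-to-Python type mapping. In the approach described above
# this mapping is resolved automatically; here it is hard-coded.
TYPE_MAP = {
    "java.lang.Integer": int,
    "java.lang.String": str,
    "java.util.ArrayList": list,
    "java.util.HashMap": dict,
}


def deserialize(node):
    """Rebuild a Python value from an assumed {"type": ..., "value": ...} record."""
    java_type, value = node["type"], node["value"]
    if java_type not in TYPE_MAP:
        raise KeyError(f"no type mapping for {java_type}")
    if java_type == "java.util.ArrayList":
        return [deserialize(item) for item in value]
    if java_type == "java.util.HashMap":
        return {key: deserialize(item) for key, item in value.items()}
    return TYPE_MAP[java_type](value)


def validate_in_isolation(fragment, recorded_json):
    """Run one translated fragment on reconstructed inputs and compare
    its result with the output recorded from the source program."""
    record = json.loads(recorded_json)
    args = [deserialize(arg) for arg in record["args"]]
    expected = deserialize(record["result"])
    return fragment(*args) == expected


# Hypothetical translation of a Java method `int totalLength(List<String> words)`.
def total_length(words):
    return sum(len(word) for word in words)


# Hypothetical record of one Java execution: totalLength(["ab", "cde"]) == 5.
RECORDED = json.dumps({
    "args": [{
        "type": "java.util.ArrayList",
        "value": [
            {"type": "java.lang.String", "value": "ab"},
            {"type": "java.lang.String", "value": "cde"},
        ],
    }],
    "result": {"type": "java.lang.Integer", "value": 5},
})

print(validate_in_isolation(total_length, RECORDED))
```

Because the fragment receives only reconstructed objects, it can be checked without translating or executing the rest of the repository. An incorrect translation, for example one that returns the number of words rather than their total length, would fail the same check.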
