Validated Code Translation for Projects with External Libraries
Hanliang Zhang, Arindam Sharma, Cristina David, Meng Wang, Brandon Paulsen, Daniel Kroening, Wenjia Ye, Taro Sekiyama
TL;DR
This paper tackles translating Go projects that rely on external libraries into Rust, addressing two core challenges: hallucination of external APIs by LLMs and the difficulty of validating semantic equivalence when library-defined types are opaque. It proposes a modular, retrieval-guided pipeline (CrossCrate) that first maps Go APIs to Rust crates using crate-scoped knowledge bases and import-resolution, then generates translations with API-informed prompts. For validation, the framework introduces a schema-backed Protobuf carrier ($T_{ ext{Proto}}$) to bridge Go and Rust, synthesizing adapters from public APIs and performing two-stage round-tripping before differential I/O testing on unit-test inputs. Experimental results on six real Go repositories with dependencies show substantial gains, achieving up to 100% compilation and equivalence and roughly a 2x average improvement over baselines, demonstrating practical viability for repository-scale translation with external libraries.
Abstract
Large Language Models (LLMs) have shown promise for program translation, particularly for migrating systems code to memory-safe languages such as Rust. However, existing approaches struggle when source programs depend on external libraries: LLMs frequently hallucinate non-existent target APIs and fail to generate call-enabling imports; moreover, validating semantic equivalence is challenging when the code manipulates opaque, library-defined types. We present a translation and validation framework for translating Go projects with external dependencies to Rust. Our approach combines (i) a retrieval mechanism that maps Go library APIs to Rust APIs, and (ii) a cross-language validation pipeline that establishes language interoperability in the presence of opaque library types by synthesising adapters exclusively from public library APIs, prior to validating I/O equivalence. We evaluate our system on six real-world Go repositories with non-trivial external dependencies. Our approach significantly increases both the compilation and equivalence success rate (up to 100% in the most dependency-heavy case; approx. 2x on average) by enabling validated translation that manipulate opaque, library-defined types.
