Adversarial Agent Collaboration for C to Rust Translation

Tianyu Li, Ruishi Li, Bo Wang, Brandon Paulsen, Umang Mathur, Prateek Saxena

TL;DR

ACToR tackles memory-safety risks in legacy C with a GAN-inspired two-agent system: a translator and a discriminator iteratively refine a C-to-Rust translation, guided by test feedback. The discriminator treats the original C program as an oracle and searches for adversarial inputs on which the translation disagrees with it, which pushes the translator to generalize beyond a small initial test suite. Across micro and macro benchmarks totaling 63 programs (473 LoC on average), ACToR achieves a test pass rate above 90% with zero human intervention and improves translation correctness by up to 25.1% over non-adversarial baselines. The approach offers a practical path to automatically converting standalone command-line C programs into memory-safe Rust.

Abstract

Translating C to memory-safe languages, like Rust, prevents critical memory safety vulnerabilities that are prevalent in legacy C software. Existing approaches for C to safe Rust translation, including LLM-assisted ones, do not generalize to larger (> 500 LoC) C codebases because they depend on complex program analyses that frequently break. In this work, we present ACToR (Adversarial C To Rust translator), a simple LLM agent-based approach. Inspired by GANs, ACToR pits a translator (generator) agent against a discriminator agent, and the two collaborate to iteratively produce a Rust translation. On each iteration, the translator agent synthesizes and refines a Rust translation to pass the existing suite of tests, and then the discriminator agent finds new tests that the translation fails. We demonstrate that ACToR translates all of the 63 real-world command-line utilities considered in our benchmarks, which have an average size of 473 lines of code, and it achieves a test pass rate of over 90% with zero human intervention during translation. To our knowledge, it is the first work to show evidence that an agent-centric approach can reliably and automatically convert standalone command-line C programs at this scale. Furthermore, ACToR improves translation correctness by up to 25.1% compared to baseline, non-adversarial approaches.
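The alternating loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the two LLM agents are replaced by toy stand-ins (`toy_translate` picks from a fixed pool of candidate programs, `toy_discriminate` scans a fixed list of probe inputs), programs are modeled as string-to-string functions, and all names here (`adversarial_loop`, `pass_rate`, `CANDIDATES`, `PROBES`) are hypothetical.

```python
from typing import Callable, List, Tuple

Program = Callable[[str], str]  # maps a program input to its output
Test = Tuple[str, str]          # (input, output expected by the oracle)


def pass_rate(program: Program, tests: List[Test]) -> float:
    """Fraction of tests on which the program reproduces the expected output."""
    if not tests:
        return 1.0
    return sum(program(inp) == out for inp, out in tests) / len(tests)


def adversarial_loop(oracle, translate, discriminate, seed_inputs, max_iterations):
    """Alternate translator and discriminator turns until no new failure is found."""
    # Expected outputs always come from the oracle (the original program).
    tests = [(inp, oracle(inp)) for inp in seed_inputs]
    candidate = translate(tests)
    for _ in range(max_iterations):
        new_tests = discriminate(candidate, oracle, tests)
        if not new_tests:
            break  # the discriminator could not break the current translation
        tests.extend(new_tests)
        candidate = translate(tests)  # refine against the enlarged test suite
    return candidate, tests


# --- Toy stand-ins for the two agents (assumptions, for illustration only) ---

def toy_oracle(s: str) -> str:
    """Plays the role of the original C program."""
    return s.strip().upper()


CANDIDATES: List[Program] = [
    lambda s: s,                  # wrong: ignores case and whitespace
    lambda s: s.upper(),          # partly right: ignores whitespace
    lambda s: s.strip().upper(),  # matches the oracle
]


def toy_translate(tests: List[Test]) -> Program:
    """Return the first candidate with the highest pass rate on the tests."""
    return max(CANDIDATES, key=lambda prog: pass_rate(prog, tests))


PROBES = ["abc", " abc ", "x"]


def toy_discriminate(candidate: Program, oracle: Program, tests: List[Test]) -> List[Test]:
    """Return at most one new test on which the candidate disagrees with the oracle."""
    for inp in PROBES:
        if candidate(inp) != oracle(inp):
            return [(inp, oracle(inp))]
    return []


final_program, final_tests = adversarial_loop(
    toy_oracle, toy_translate, toy_discriminate, seed_inputs=["ABC"], max_iterations=10
)
print(pass_rate(final_program, final_tests), len(final_tests))
```

With the single seed input `"ABC"`, every candidate passes the initial suite, so the weakest one is selected first. Each discriminator turn then adds one failing input, and the translator is forced toward the candidate that agrees with the oracle on whitespace handling as well as case.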

Paper Structure

This paper contains 15 sections, 10 figures, 1 table, 1 algorithm.

Figures (10)

  • Figure 1: High-level overview of ACToR. The Translator and the Discriminator agents update the translation and the tests in turn to iteratively improve the correctness of the translated program.
  • Figure 2: Overall correctness (pass rate) achieved by ACToR on the micro benchmark, compared with the naive baseline across 5 different agent-model settings.
  • Figure 3: Relative comparison among 3 translation methods on Claude Code with Claude-Sonnet-4.5. Entry $(\text{row}, \text{column})$ is the relative pass rate of the row method's translation on the column method's tests.
  • Figure 4: The validation pass rate of ACToR on different configurations. Pass rate versus the number of new test cases added per iteration (Left). Pass rate versus the number of iterations (Right).
  • Figure 5: The relative pass rate when cross-comparing ACToR and the coverage baseline (Cov-Base) on Claude Code with Claude-Sonnet-4.5 at iteration 10. For each program, the left bar shows the pass rate of ACToR's translation on the tests generated by the coverage baseline; the right bar shows the pass rate of the coverage baseline's translation on ACToR's tests. The length of each program in LoC is shown next to the program name.
  • ...and 5 more figures
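
The cross-comparison used in Figures 3 and 5 (each method's translation evaluated on every method's tests) can be sketched as a small matrix computation. This is only an illustration of the row/column indexing: it assumes the entry is the plain fraction of the column method's tests that the row method's translation passes, whereas the paper's "relative" pass rate may be normalized differently. The method names (`careful`, `naive`), the toy programs, and the helper names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Program = Callable[[str], str]
Test = Tuple[str, str]  # (input, expected output)


def pass_rate(program: Program, tests: List[Test]) -> float:
    """Fraction of tests on which the program reproduces the expected output."""
    if not tests:
        return 1.0
    return sum(program(inp) == out for inp, out in tests) / len(tests)


def cross_pass_rates(
    translations: Dict[str, Program],
    test_suites: Dict[str, List[Test]],
) -> Dict[Tuple[str, str], float]:
    """Entry (row, column): row method's translation run on column method's tests."""
    return {
        (row, col): pass_rate(translations[row], test_suites[col])
        for row in translations
        for col in test_suites
    }


# Hypothetical toy data: two methods, each with its own translation and tests.
translations = {
    "careful": lambda s: s.strip().upper(),
    "naive": lambda s: s.upper(),
}
test_suites = {
    "careful": [(" a ", "A"), ("b", "B")],  # includes a whitespace edge case
    "naive": [("b", "B")],                  # only the easy case
}

matrix = cross_pass_rates(translations, test_suites)
for (row, col), rate in sorted(matrix.items()):
    print(f"{row:>8} on {col:<8} tests: {rate:.2f}")
```

An asymmetric matrix is the interesting outcome: here the hypothetical `naive` translation fails one of the `careful` method's tests, while the `careful` translation passes everything in the `naive` suite.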