RustAssure: Differential Symbolic Testing for LLM-Transpiled C-to-Rust Code

Yubo Bai, Tapti Palit

TL;DR

RustAssure addresses the risk of incorrect LLM-generated C-to-Rust translations by pairing automatic transpilation with differential symbolic testing to check semantic parity. It uses an LLM Invoker with prompt engineering and a Semantic Similarity Checker that performs symbolic execution and graph-based comparison across languages, bridging memory-layout and type differences. On five real-world codebases and four LLMs, RustAssure produced compilable Rust translations in 89.8% of cases, with 72% of cases semantically equivalent, while uncovering semantic bugs not detected by some baselines. The work demonstrates both the practical viability of automated migration to safe Rust and the potential of cross-language symbolic verification to improve LLM reliability.

Abstract

Rust is a memory-safe programming language that significantly improves software security. Existing codebases written in memory-unsafe languages, such as C, must first be transpiled to Rust to take advantage of Rust's improved safety guarantees. RustAssure is a system that uses Large Language Models (LLMs) to automatically transpile existing C codebases to Rust. RustAssure uses prompt engineering techniques to maximize the chances of the LLM generating idiomatic and safe Rust code. Moreover, because LLMs often generate code with subtle bugs that can be missed under traditional unit or fuzz testing, RustAssure performs differential symbolic testing to establish the semantic similarity between the original C and LLM-transpiled Rust code. We evaluated RustAssure with five real-world applications and libraries, and showed that our system is able to generate compilable Rust functions for 89.8% of all C functions, of which 69.9% produced equivalent symbolic return values for both the C and Rust functions.
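The idea of differential testing can be illustrated with a minimal sketch. This is not RustAssure's implementation: it replaces symbolic execution with exhaustive enumeration over a small input domain, and every name in it (`average_c_model`, `average_transpiled`, `first_divergence`) is hypothetical. It assumes a C function `unsigned char average(unsigned char a, unsigned char b) { return (a + b) / 2; }`, where integer promotion keeps the sum from overflowing, and a flawed translation that keeps the addition in `u8`.

```rust
/// Models the assumed C function. C promotes both operands to `int`
/// before adding, so the sum cannot overflow; widening to `u32`
/// reproduces that behavior.
fn average_c_model(a: u8, b: u8) -> u8 {
    ((a as u32 + b as u32) / 2) as u8
}

/// Hypothetical flawed translation: the addition stays in `u8`
/// and wraps around when the sum exceeds 255.
fn average_transpiled(a: u8, b: u8) -> u8 {
    a.wrapping_add(b) / 2
}

/// Returns the first input pair on which the two functions disagree.
/// Exhaustive enumeration stands in for symbolic execution here; it is
/// only feasible because the input domain has 65,536 points.
fn first_divergence(
    reference: fn(u8, u8) -> u8,
    candidate: fn(u8, u8) -> u8,
) -> Option<(u8, u8)> {
    for a in 0..=u8::MAX {
        for b in 0..=u8::MAX {
            if reference(a, b) != candidate(a, b) {
                return Some((a, b));
            }
        }
    }
    None
}

fn main() {
    match first_divergence(average_c_model, average_transpiled) {
        Some((a, b)) => println!("divergence at a={a}, b={b}"),
        None => println!("no divergence found"),
    }
}
```

A unit test with small inputs such as `average(2, 4)` would pass for both versions, which is why a check that covers the whole input space, as symbolic execution aims to do, can expose bugs that hand-picked tests miss.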

Paper Structure

This paper contains 22 sections, 7 figures, 8 tables.

Figures (7)

  • Figure 1: Bug detected by RustAssure. When executed with an invalid UTF string, the LLM-generated Rust code returns an incorrect intermediate state instead of a pointer to the start of the input string, which is the expected functionality.
  • Figure 2: RustAssure transpiler toolchain.
  • Figure 3: C and Rust symbolic return values for the u8fold function. The red boxes are language-specific artifacts added by the Rust compiler.
  • Figure 4: Differing memory layout between C and Rust. The argument input is a char* array, for which the transpiled Rust code uses the Vec type.
  • Figure 5: Type usage based transpilation. The GPT-4o model can correctly identify that the Vec<u8> type is more appropriate in the first case, and the String type is correct for the second case.
  • ...and 2 more figures