
rCanary: Detecting Memory Leaks Across Semi-automated Memory Management Boundary in Rust

Mohan Cui, Hui Xu, Hongliang Tian, Yangfan Zhou

TL;DR

rCanary detects memory leaks that arise across the semi-automated memory management boundary in Rust. It is a static, non-intrusive model checker built on Rust MIR that combines an encoder for abstracting heap-allocated data (rtoken) with a leak-free memory model expressed as SMT constraints and solved by Z3, enabling precise, path-sensitive analysis that still scales to the real Rust ecosystem. The approach identifies two root leak patterns, orphan objects and proxy types. It recalls the leaks in nine benchmark crates and finds 19 leaking crates among more than 1,200 real-world projects, at an average cost of 8.4 seconds per crate. The work shows the practical value of static leak detection for Rust tooling and provides an extensible framework for future work on boundary-aware memory safety, including FFI considerations and broader language support.

Abstract

Rust is an effective systems programming language that guarantees memory safety via compile-time verification. It employs a novel ownership-based resource management model to facilitate automated deallocation. This model is anticipated to eliminate memory leaks. However, we observed that user intervention drives it into semi-automated memory management and makes it prone to leaks. Unlike violations of the memory-safety guarantees, which are confined to code marked with the unsafe keyword, the boundary for leaking memory is implicit, and the compiler does not emit any warnings for developers. In this paper, we present rCanary, a static, non-intrusive, and fully automated model checker to detect leaks across the semi-automated boundary. We design an encoder to abstract data with heap allocation and formalize a refined leak-free memory model based on Boolean satisfiability. It can generate SMT-Lib2 format constraints for Rust MIR and is implemented as a Cargo component. We evaluate rCanary on a benchmark of flawed packages collected from the pull requests of open-source Rust projects. The results indicate that it can recall all of these defects with an acceptable number of false positives. We further apply our tool to more than 1,200 real-world crates from crates.io and GitHub, identifying 19 crates with memory leaks. Our analyzer is also efficient, taking 8.4 seconds per package.
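
To make the semi-automated boundary concrete, the following is a minimal sketch of how safe-looking Rust code can skip automated deallocation without any compiler warning. It is not taken from the paper's benchmarks: the names `Tracked`, `Proxy`, and the helper functions are hypothetical, and the mapping of the two leaking cases onto the paper's orphan-object and proxy-type patterns is an informal reading of those terms rather than the formal definitions. Each function reports how many times the destructor ran.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::rc::Rc;

// Hypothetical heap-owning type; its destructor bumps a shared counter
// so we can observe whether automated deallocation actually happened.
struct Tracked {
    _buf: Vec<u8>,
    drops: Rc<Cell<usize>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn tracked(drops: &Rc<Cell<usize>>) -> Tracked {
    Tracked { _buf: vec![0u8; 16], drops: Rc::clone(drops) }
}

// Fully automated: the owner goes out of scope and the value is dropped.
fn automatic() -> usize {
    let drops = Rc::new(Cell::new(0));
    { let _value = tracked(&drops); }
    drops.get()
}

// Orphan-object style leak (assumed reading): ManuallyDrop suppresses the
// destructor and nobody frees the value by hand.
fn orphan_leak() -> usize {
    let drops = Rc::new(Cell::new(0));
    { let _value = ManuallyDrop::new(tracked(&drops)); }
    drops.get()
}

// The manual deallocation that the leaking version is missing.
fn orphan_fixed() -> usize {
    let drops = Rc::new(Cell::new(0));
    {
        let mut value = ManuallyDrop::new(tracked(&drops));
        // Safety: `value` is not used again after this call.
        unsafe { ManuallyDrop::drop(&mut value) };
    }
    drops.get()
}

// Proxy-type style leak (assumed reading): a struct stores a raw pointer to
// a heap item but has no Drop impl that releases it.
#[allow(dead_code)]
struct Proxy {
    ptr: *mut Tracked,
}

fn proxy_leak() -> usize {
    let drops = Rc::new(Cell::new(0));
    { let _proxy = Proxy { ptr: Box::into_raw(Box::new(tracked(&drops))) }; }
    drops.get()
}

fn main() {
    println!("automatic:    {} drop(s)", automatic());
    println!("orphan_leak:  {} drop(s)", orphan_leak());
    println!("orphan_fixed: {} drop(s)", orphan_fixed());
    println!("proxy_leak:   {} drop(s)", proxy_leak());
}
```

Neither leaking function needs `unsafe`, which is why the boundary is implicit: the unsafe block appears only in the version that frees the value correctly.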

Paper Structure

This paper contains 37 sections, 7 figures, 6 tables, and 1 algorithm.

Figures (7)

  • Figure 1: The relationship between rtoken, Rust owner, and heap item. In the first panel, the rtoken holders can be objects, references, and pointers. The stack frame and heap chunks reflect the real data storage. When an object exists, the object holds the rtoken (solid), while other pointers only point to it (dotted). If the object does not exist, pointer types hold the rtoken (solid). In the second panel, the first argument of each method is wrapped with ManuallyDrop in the function body.
  • Figure 2: Motivating examples of orphan-object and proxy-type issues detected in rCanary, caused by the lack of manual deallocation towards ManuallyDrop values.
  • Figure 3: The system architecture of rCanary.
  • Figure 4: Example of the encoder, including AdtDef analysis for Vec<T,A> and type encoding for String. The types have been flattened, forming a directed acyclic graph (DAG). Results are propagated forward using reverse topological sorting.
  • Figure 5: The intra-procedural rules in the leak-free memory model.
  • ...and 2 more figures

Theorems & Definitions (10)

  • Definition 2.1: Heap Item, Rtoken
  • Definition 2.2: Orphan Object
  • Definition 2.3: Proxy Type
  • Definition 4.1: Rtoken
  • Definition 4.2: Heap-item Unit
  • Definition 4.3: Isolated Parameter
  • Definition 5.1: Extend
  • Definition 5.2: Shrink
  • Definition 5.3: Rtoken Constructor
  • Definition 5.4: Rtoken Destructor