AddressWatcher: Sanitizer-Based Localization of Memory Leak Fixes

Aniruddhan Murali, Mahmoud Alfadel, Meiyappan Nagappan, Meng Xu, Chengnian Sun

TL;DR

To address memory leaks in C/C++ programs, the paper introduces AddressWatcher, a sanitizer-based dynamic framework that tracks the semantics of leaked memory across multiple execution paths via a leak database. It combines dynamic instrumentation with shadow-memory tagging to identify last-use points and propose fix locations, aiming to overcome the limitations of static path-exhaustive methods and single-path dynamic approaches. In an evaluation on 50 real leaks across five open-source projects, AddressWatcher points to a correct free location for 23 leaks (46%). The authors also submitted 25 pull requests across 12 repositories (21 merged), including a fix that prompted a new version release of calc. The work demonstrates a practical, open-source workflow for collaboratively fixing memory leaks and highlights areas for future improvement, such as fuzzing to improve test coverage and better handling of error paths and multi-threaded scenarios.

Abstract

Memory leak bugs are a major problem in C/C++ programs. They occur when memory objects are not deallocated. Developers need to manually deallocate these objects to prevent memory leaks. As such, several techniques have been proposed to automatically fix memory leaks. Although the proposed approaches have merit, they also have limitations. Static approaches attempt to trace the complete semantics of a memory object across all paths. However, they face scalability challenges when the target program has a large number of leaked paths. Dynamic approaches, on the other hand, can spell out the precise semantics of a memory object, but only on a single execution path (without considering multiple execution paths). In this paper, we complement prior approaches by designing and implementing a novel framework named AddressWatcher. AddressWatcher is a dynamic approach that allows the semantics of a memory object to be tracked on multiple execution paths. AddressWatcher accomplishes this by using a leak database that is designed to store and compare different execution paths of a leak over several test cases. We evaluate AddressWatcher on a benchmark of five open-source packages, namely binutils, openssh, tmux, openssl, and git. In 23 out of the 50 examined memory leak bugs, AddressWatcher correctly points to a free location that fixes the leak. Moreover, we submitted 25 new pull requests (PRs) to 12 popular open-source project repositories to resolve memory leaks in them. Of these, 21 PRs were merged, addressing 5 open GitHub issues. In fact, a critical fix prompted a new version release for the calc repository, a program used to find large primes. Furthermore, our contributions through these PRs sparked intense discussions and appreciation in various repositories such as coturn, h2o, and radare2.
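
To make the idea of a leak database more concrete, the following is a minimal sketch, not the paper's implementation. It assumes each leaking run can be summarized as an ordered list of locations where the leaked object was used; the class name `LeakDatabase`, its methods, and the allocation-site and function names are all hypothetical. It shows why observing several test cases matters: a single run would report only one last-use point, while the leak may need a free on each distinct path.

```python
from collections import defaultdict


class LeakDatabase:
    """Illustrative store of per-test use traces for each leaked allocation site.

    Hypothetical structure for explanation only; the paper's actual
    database schema is not described in the abstract.
    """

    def __init__(self):
        # allocation site -> {test case name -> ordered list of use locations}
        self._traces = defaultdict(dict)

    def record(self, alloc_site, test_case, trace):
        """Store the ordered use trace observed for one leaking test run."""
        self._traces[alloc_site][test_case] = list(trace)

    def last_uses(self, alloc_site):
        """Map each test case to the last location that used the object."""
        return {
            test: trace[-1]
            for test, trace in self._traces[alloc_site].items()
            if trace
        }

    def candidate_free_points(self, alloc_site):
        """Return the distinct last-use locations across all recorded runs.

        One entry means every observed path agrees on where the object is
        last used; several entries mean each path may need its own free.
        """
        return sorted(set(self.last_uses(alloc_site).values()))


db = LeakDatabase()
# Two hypothetical test cases exercise the same allocation on different paths.
db.record("alloc@parse.c:42", "test_a",
          ["parse_header", "build_index", "print_summary"])
db.record("alloc@parse.c:42", "test_b",
          ["parse_header", "build_index"])

candidates = db.candidate_free_points("alloc@parse.c:42")
print(candidates)
```

In this sketch, freeing only after `build_index` would be unsafe for the run that still uses the object in `print_summary`, which is why the candidates are reported per path rather than collapsed into one location.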

Paper Structure (16 sections, 2 figures, 4 tables)

Figures (2)

  • Figure 1: An overview of our approach for suggesting a location of a memory leak fix.
  • Figure 2: Distribution of memory leak fixes by AddressWatcher (AW) and Memfix (MF).