PhaseSeed: Precise Call Graph Construction for Split-Phase Applications using Dynamic Seeding

Tapti Palit, Seyedhamed Ghavamnia, Michalis Polychronakis

TL;DR

The paper tackles the imprecision of static pointer analysis, which weakens the security defenses built on application call graphs, by exploiting split-phase application architectures. It introduces PhaseSeed, which dynamically executes the initialization phase to observe precise points-to relationships and types, then seeds this information into a static analysis of the processing phase. The result remains sound with respect to the initial runtime configuration. Key contributions include dynamic interpretation with accurate heap-type derivation, iterative code partitioning to identify processing-phase functions, and heap-object-aware seeding and cloning to preserve per-object precision, all implemented atop LLVM 12 and SVF. Empirically, PhaseSeed improves CFI precision by up to 92.6%, reduces code exposure via debloating, extends system-call filtering, and reduces analysis time.

Abstract

Precise and sound call graph construction is crucial for many software security mechanisms. Unfortunately, traditional static pointer analysis techniques used to generate application call graphs suffer from imprecision. These techniques are agnostic to the application's architecture and are designed for broad applicability. To mitigate this precision problem, we propose PhaseSeed, a novel technique that improves the accuracy of pointer analysis for split-phase applications, which have distinct initialization and processing phases. PhaseSeed analyzes the initialization phase dynamically, collecting the points-to relationships established at runtime. At the end of the initialization phase, it then seeds this information to a static analysis stage that performs pointer analysis for all code that stays in scope during the processing phase, improving precision. Our observations show that, given the same runtime configuration options, the points-to relationships established during the initialization phase remain constant across multiple runs. Therefore, PhaseSeed is sound with respect to a given initial configuration. We apply PhaseSeed to three security mechanisms: control flow integrity (CFI), software debloating, and system call filtering. PhaseSeed provides up to 92.6% precision improvement for CFI compared to static call graph construction techniques, and filters nine additional security-critical system calls when used to generate Seccomp profiles.
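The seeding idea in the abstract can be illustrated with a toy model. The sketch below is not PhaseSeed's implementation (which, per the TL;DR, is built on LLVM 12 and SVF). It is a minimal inclusion-based solver over two hypothetical constraint kinds, and every name in it (`solve`, `run_init`, `handler`, `callee`, `handle_http`, `handle_fastcgi`) is an assumption made up for illustration. It compares a whole-program, flow-insensitive analysis against an analysis of the processing phase alone, seeded with the points-to facts observed in one run of the initialization phase.

```python
from collections import defaultdict

# Toy constraints (hypothetical, for illustration only):
#   ("addr", p, obj)  models  p = &obj
#   ("copy", p, q)    models  p = q
INIT = [
    ("addr", "handler", "handle_http"),     # taken when the config selects HTTP
    ("addr", "handler", "handle_fastcgi"),  # taken otherwise
]
PROCESSING = [
    ("copy", "callee", "handler"),          # callee = handler; callee(request)
]

def solve(constraints, seed=None):
    """Flow-insensitive, inclusion-based fixed point over the toy constraints.

    `seed` optionally pre-populates points-to sets, standing in for facts
    observed while executing the initialization phase.
    """
    pts = defaultdict(set)
    for var, objs in (seed or {}).items():
        pts[var] |= set(objs)
    changed = True
    while changed:
        changed = False
        for kind, dst, src in constraints:
            new = {src} if kind == "addr" else set(pts[src])
            if not new <= pts[dst]:
                pts[dst] |= new
                changed = True
    return pts

def run_init(mode):
    """Stand-in for dynamically running initialization under one configuration."""
    target = "handle_http" if mode == "http" else "handle_fastcgi"
    return {"handler": {target}}

# Fully static: both branches of the initialization code contribute targets.
static_targets = solve(INIT + PROCESSING)["callee"]

# Seeded: only processing-phase constraints are analyzed, starting from the
# points-to facts observed for the "http" configuration.
seeded_targets = solve(PROCESSING, seed=run_init("http"))["callee"]

print(sorted(static_targets))
print(sorted(seeded_targets))
```

In this toy setting the static analysis reports two possible targets for the indirect call, while the seeded analysis reports one. The seeded result holds only for the configuration that was executed, which mirrors the abstract's statement that PhaseSeed is sound with respect to a given initial configuration.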

Paper Structure

This paper contains 31 sections, 12 figures, 4 tables, 1 algorithm.

Figures (12)

  • Figure 1: Type-punning in the code of Lighttpd to implement inheritance and polymorphism.
  • Figure 2: A fully static analysis approach compared to PhaseSeed. The start_processing annotation indicates the transition between the initialization and processing phases for PhaseSeed. Colored boxes show the statements operated on by any given stage. The fully static approach derives multiple spurious points-to relationships due to imprecision, while PhaseSeed's dynamically seeded approach can precisely derive the points-to relationships for all pointers in the sample code.
  • Figure 3: PhaseSeed pipeline stages.
  • Figure 4: MbedTLS example illustrating difficulty in statically determining heap object types.
  • Figure 5: Lighttpd example illustrating a cast up from a more expressive type to a less expressive type.
  • ...and 7 more figures