Practical Type-Based Taint Checking and Inference (Extended Version)

Nima Karimipour, Kanak Das, Manu Sridharan, Behnaz Hassanshahi

TL;DR

This work tackles taint-tracking for security by addressing scalability and practicality gaps in static taint analysis. It introduces TaintTyper, a type-based taint checker that is designed to be modular, incremental, and capable of handling unannotated code via polymorphic defaulting, while also inferring taint type qualifiers for existing code, including generic type arguments and @PolyTaint. The approach is evaluated on both toy benchmarks and real-world Java projects, showing higher recall with comparable precision and significant speedups relative to state-of-the-art whole-program analyzers, with ablations confirming the importance of the new checker and inference features. The results indicate that type-based taint checking can be made practical for real-world codebases, enabling fast, scalable, and maintainable taint analysis that complements or substitutes traditional interprocedural approaches.

Abstract

Many important security properties can be formulated in terms of flows of tainted data, and improved taint analysis tools to prevent such flows are critically needed. Most existing taint analyses use whole-program static analysis, leading to scalability challenges. Type-based checking is a promising alternative, as it enables modular and incremental checking for fast performance. However, type-based approaches have not been widely adopted in practice, due to challenges with false positives and with annotating existing codebases. In this paper, we present a new approach to type-based checking of taint properties that addresses these challenges, based on two key techniques. First, we present a new type-based tainting checker with significantly reduced false positives, via more practical handling of third-party libraries and other language constructs. Second, we present a novel technique to automatically infer tainting type qualifiers for existing code. Our technique supports inference of generic type argument annotations, which is crucial for tainting properties. We implemented our techniques in a tool, TaintTyper, and evaluated it on real-world benchmarks. TaintTyper exceeds the recall of a state-of-the-art whole-program taint analyzer, with comparable precision and checking times that are 2.93X to 22.9X faster. Further, TaintTyper infers annotations comparable to those written by hand, suitable for insertion into source code. TaintTyper is a promising new approach to efficient and practical taint checking.
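To make the idea of tainting type qualifiers concrete, the following is a minimal sketch of what annotated Java code could look like. It is not TaintTyper's actual API: the qualifier names `@Tainted` and `@Untainted` and all method names are assumptions made for illustration, and `@PolyTaint` is borrowed from the TL;DR above. The qualifiers are declared locally so that the file compiles on its own; without a pluggable type checker they are inert, so the comments describe what a checker would be expected to report rather than anything plain `javac` enforces.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

public class TaintSketch {
    // Hypothetical qualifiers, declared here only so the sketch is self-contained.
    @Target(ElementType.TYPE_USE) public @interface Tainted {}
    @Target(ElementType.TYPE_USE) public @interface Untainted {}
    @Target(ElementType.TYPE_USE) public @interface PolyTaint {}

    // Source: values coming from outside the program are treated as tainted.
    public static @Tainted String readUserInput() {
        return "42 OR 1=1";
    }

    // Sink: a taint checker would accept only untainted arguments here.
    public static String runQuery(@Untainted String id) {
        return "executed: " + id;
    }

    // Sanitizer: strips everything except digits. A checker would need to
    // trust this method (for example via a suppression) to accept the result.
    public static @Untainted String keepDigits(@Tainted String raw) {
        return raw.replaceAll("[^0-9]", "");
    }

    // Polymorphic method: the result is as tainted as the argument.
    public static @PolyTaint String trim(@PolyTaint String s) {
        return s.trim();
    }

    public static void main(String[] args) {
        @Tainted String raw = readUserInput();

        // runQuery(raw);  // a taint checker would be expected to reject this call

        @Untainted String id = keepDigits(raw);
        System.out.println(runQuery(id));

        // Qualifiers can also appear on generic type arguments: this list
        // promises that every element is untainted.
        List<@Untainted String> safeIds = new ArrayList<>();
        safeIds.add(trim(id));   // untainted in, untainted out
        // safeIds.add(raw);     // would be expected to be rejected as well
        System.out.println(safeIds);
    }
}
```

Because each method is checked against its declared qualifiers alone, a change to one method body only requires rechecking that method, which is the modularity the abstract refers to. The annotations on `keepDigits`, `trim`, and the `List` type argument are examples of the kind of qualifiers that inference would aim to produce automatically for existing code.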

Paper Structure

This paper contains 25 sections, 1 equation, 3 figures, 1 table, 2 algorithms.

Figures (3)

  • Figure 1: Motivating example for inference. Green text indicates where annotations are inserted by TaintTyper.
  • Figure 2: High-level architecture of TaintTyper.
  • Figure 3: Logic for determining the return type qualifier for a method call, accounting for unannotated code.