A New Framework of Software Obfuscation Evaluation Criteria

Bjorn De Sutter

TL;DR

This paper analyzes the historical criteria for software protection (SP) evaluation (potency, resilience, and stealth) and shows how their inconsistent definitions impede reliable assessment of obfuscation and tamperproofing. It surveys the original framework of Collberg, Thomborson, and Low (CTL), the refinements by Nagra and Collberg, and the abstract-interpretation approaches of Dalla Preda and Giacobazzi, highlighting persistent shortcomings in how they handle layering, attacker goals, and practical applicability. The paper proposes a new framework comprising eight core criteria: relevance, effectiveness (or efficacy), robustness, concealment, stubbornness, sensitivity, predictability, and cost. Each criterion has detailed subcriteria (e.g., $Re_a$, $E_o$, $Ro_o$, $C_l$) that ground evaluations in realistic attacker strategies and layered protection scenarios. The framework emphasizes evaluating actual attack steps and their interactions across varied samples and configurations, while accommodating ad hoc tool-based metrics when they are applied judiciously. The work aims to improve construct, internal, external, and instantiation validity, offering a pragmatic path toward standardized, repeatable SP evaluation and informing researchers and practitioners about the tradeoffs and practical impact of obfuscation techniques.

Abstract

In the domain of practical software protection against man-at-the-end attacks such as software reverse engineering and tampering, much of the scientific literature is plagued by the use of subpar methods to evaluate the protections' strength and even by the absence of such evaluations. Several criteria have been proposed in the past to assess the strength of protections, such as potency, resilience, stealth, and cost. We analyze their evolving definitions and uses. We formulate a number of critiques, from which we conclude that the existing definitions are unsatisfactory and need to be revised. We present a new framework of software protection evaluation criteria: relevance, effectiveness (or efficacy), robustness, concealment, stubbornness, sensitivity, predictability, and cost.

Paper Structure

This paper contains 94 sections, 1 table.