Obfuscation as Instruction Decorrelation

Ali Ajorian, Erick Lavoie, Christian Tschudin

TL;DR

A formal definition of instruction independence with multiple instantiations for various aspects of programs; a combination of program transformations that meet the corresponding instances of instruction independence against an honest-but-curious adversary; and an implementation of an interpreter that uses a trusted execution environment only to perform memory address translation and memory shuffling.

Abstract

Obfuscation of computer programs has historically been approached either as a practical but ad hoc craft to make reverse engineering subjectively difficult, or as a sound theoretical investigation unfortunately detached from the numerous existing constraints of engineering practical systems. In this paper, we propose instruction decorrelation as a new approach that makes the instructions of a set of real-world programs appear independent from one another. We contribute: a formal definition of instruction independence with multiple instantiations for various aspects of programs; a combination of program transformations, specifically random interleaving and memory access obfuscation, that meet the corresponding instances of instruction independence against an honest-but-curious adversary; and an implementation of an interpreter that uses a trusted execution environment (TEE) only to perform memory address translation and memory shuffling, leaving instruction execution outside the TEE. These first steps highlight the practicality of our approach. Combined with additional techniques to protect the content of memory and to lower the requirements on TEEs, this work could lead to more secure obfuscation techniques that execute on commonly available hardware.

Paper Structure

This paper contains 29 sections, 4 theorems, 11 equations, 9 figures, 2 tables, 2 algorithms.

Key Result

Theorem 1 (source hiding): Equation eq:unint implies that an adversary cannot establish a relationship between any randomly selected instruction $s \in \mathcal{O}(P)$ and its originating program $P_i \in P$ with more than a negligible advantage compared to random guessing.
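
To make the setting of this result more concrete, the following is a minimal sketch of random interleaving, one of the two transformations named in the abstract. It is an illustration under stated assumptions, not the authors' implementation: the function name `random_interleave`, the representation of instructions as strings, and the choice of a uniform distribution over order-preserving merges are all hypothetical. The source tag attached to each instruction is kept only so the result can be checked; an obfuscated stream would not expose it.

```python
import random
from typing import List, Sequence, Tuple


def random_interleave(
    programs: Sequence[Sequence[str]], rng: random.Random
) -> List[Tuple[int, str]]:
    """Merge several instruction streams into one (hypothetical helper).

    Each program's internal instruction order is preserved. At every step
    the source program is drawn with probability proportional to the number
    of instructions it still has left, which makes every order-preserving
    merge equally likely.
    """
    cursors = [0] * len(programs)              # next instruction per program
    remaining = [len(p) for p in programs]     # instructions left per program
    merged: List[Tuple[int, str]] = []

    while sum(remaining) > 0:
        # Programs with no instructions left have weight 0 and are never drawn.
        i = rng.choices(range(len(programs)), weights=remaining, k=1)[0]
        merged.append((i, programs[i][cursors[i]]))  # tag kept for checking only
        cursors[i] += 1
        remaining[i] -= 1

    return merged


if __name__ == "__main__":
    # Toy instruction streams; the mnemonics are placeholders.
    p1 = ["load r1", "add r1", "store r1"]
    p2 = ["load r2", "mul r2"]
    for source, instruction in random_interleave([p1, p2], random.Random()):
        print(source, instruction)
```

This toy version ignores control flow and memory accesses, which the paper handles with separate transformations; it only shows how instructions from several programs can end up in one stream whose positions alone do not reveal the originating program.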

Figures (9)

  • Figure 1: Architecture of our implementation
  • Figure 2: Control-Dependent vs. Control-Independent Syntaxes
  • Figure 3: Probability distribution of $P_1, P_2$
  • Figure 4: Simple loop example
  • Figure 5: Flat Memory Structure of Data Section
  • ...and 4 more figures

Theorems & Definitions (5)

  • Definition 1: Instruction-Independent Obfuscation
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4