Cryptic Bytes: WebAssembly Obfuscation for Evading Cryptojacking Detection

Håkon Harnes, Donn Morrison

TL;DR

This work presents the most extensive evaluation to date of code obfuscation techniques for WebAssembly, examining their effectiveness, detectability, and overhead across multiple abstraction levels. It introduces emcc-obf and benchmarks it alongside Tigress and wasm-mutate on a dataset of over $2.0\times10^4$ obfuscated binaries, showing that obfuscation can significantly distort WebAssembly binaries and, in many cases, evade state-of-the-art cryptojacking detectors. Tigress is the most effective obfuscator at producing dissimilar binaries and increasing native-code size. Obfuscation can carry substantial performance and size overhead, but carefully chosen and stacked transformations can evade detectors while keeping that overhead low. The obfuscated WebAssembly dataset and the emcc-obf tool are publicly available to support research into more robust detection methods and more resilient defenses against WebAssembly-based cryptojacking. The results highlight a practical trade-off between evasion capability and performance and size penalties, informing defenders and researchers about realistic threat models and detection gaps.

Abstract

WebAssembly has gained significant traction as a high-performance, secure, and portable compilation target for the Web and beyond. However, its growing adoption has also introduced new security challenges. One such threat is cryptojacking, where websites mine cryptocurrencies on visitors' devices without their knowledge or consent, often through the use of WebAssembly. While detection methods have been proposed, research on circumventing them remains limited. In this paper, we present the most comprehensive evaluation of code obfuscation techniques for WebAssembly to date, assessing their effectiveness, detectability, and overhead across multiple abstraction levels. We obfuscate a diverse set of applications, including utilities, games, and crypto miners, using state-of-the-art obfuscation tools like Tigress and wasm-mutate, as well as our novel tool, emcc-obf. Our findings suggest that obfuscation can effectively produce dissimilar WebAssembly binaries, with Tigress proving most effective, followed by emcc-obf and wasm-mutate. The impact on the resulting native code is also significant, although the V8 engine's TurboFan optimizer can reduce native code size by 30\% on average. Notably, we find that obfuscation can successfully evade state-of-the-art cryptojacking detectors. Although obfuscation can introduce substantial performance overheads, we demonstrate how obfuscation can be used for evading detection with minimal overhead in real-world scenarios by strategically applying transformations. These insights are valuable for researchers, providing a foundation for developing more robust detection methods. Additionally, we make our dataset of over 20,000 obfuscated WebAssembly binaries and the emcc-obf tool publicly available to stimulate further research.

Paper Structure

This paper contains 48 sections, 7 equations, 15 figures, and 3 tables.

Figures (15)

  • Figure 1: WebAssembly serves as an intermediate bytecode, bridging the gap between multiple source languages and host environments. The host environments compile the WebAssembly binaries into native code for underlying hardware architectures.
  • Figure 2: Cryptojacking process: The mining script is fetched from the web server, which instantiates the web workers and connects to the WebSocket proxy server. The proxy server relays the communication back to the mining pool.
  • Figure 3: Overview of MINOS: The WebAssembly binary is converted to a grayscale image and fed to the MINOS network. The network predicts whether the binary is benign or malicious.
  • Figure 4: Overview of WASim: Features are extracted from the WebAssembly binaries and fed into a classifier. The classifier model is either: (a) Neural, (b) SVM, (c) Random forest, or (d) Naive Bayes. The classifier outputs a usage report containing the predictions.
  • Figure 5: Overview of MinerRay: The WebAssembly binary is converted into a custom intermediate language, from which an interprocedural CFG is constructed. The control flow is then analyzed to detect cryptojacking, as well as for checking user consent.
  • ...and 10 more figures
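The binary-to-image step described in the Figure 3 caption can be sketched roughly as follows. This is a minimal illustration and not MINOS's actual preprocessing: the function name `bytes_to_grayscale`, the square layout, and the zero padding are assumptions made here, since the caption only states that the binary is converted to a grayscale image.

```python
import math


def bytes_to_grayscale(data: bytes) -> list[list[int]]:
    """Lay out raw bytes as a square grid of 0-255 pixel intensities.

    Hypothetical helper: each byte becomes one pixel, and the tail of the
    last row is padded with zeros (an assumed padding choice).
    """
    if not data:
        return []
    # Smallest square side that can hold every byte.
    side = math.isqrt(len(data))
    if side * side < len(data):
        side += 1
    padded = data + bytes(side * side - len(data))
    return [list(padded[row * side:(row + 1) * side]) for row in range(side)]


# The 8-byte WebAssembly module preamble: magic "\0asm" followed by version 1.
header = b"\x00asm\x01\x00\x00\x00"
image = bytes_to_grayscale(header)
```

Because every byte maps directly to a pixel, any transformation that changes the byte layout of the binary also changes the image a classifier would see, which is one intuition for why obfuscation can affect image-based detection.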

Theorems & Definitions (9)

  • Definition 1: Transformation
  • Definition 2: Distance
  • Definition 3: Native code size increase
  • Definition 4: Precision
  • Definition 5: Recall
  • Definition 6: F$_1$ score
  • Definition 7: File size increase
  • Definition 8: Hash rate
  • Definition 9: Relative hash rate
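Definitions 4 to 6 name standard classification metrics. Only their titles appear in this outline, so the sketch below uses the conventional formulas rather than the paper's exact wording, and the confusion-matrix counts in the example are hypothetical values chosen for illustration.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of binaries flagged as miners that really are miners."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual miners that the detector flags."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical detector results: 8 miners caught, 2 false alarms, 8 miners missed.
p = precision(tp=8, fp=2)
r = recall(tp=8, fn=8)
f1 = f1_score(tp=8, fp=2, fn=8)
```

In an evasion setting, missed miners count as false negatives, so successful obfuscation shows up mainly as a drop in recall and, through it, in the F$_1$ score.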