Domainator: Detecting and Identifying DNS-Tunneling Malware Using Metadata Sequences

Denis Petrov, Pascal Ruffing, Sebastian Zillien, Steffen Wendzel

TL;DR

Domainator is a sequence-based approach for detecting and identifying DNS-tunneling malware using statistical features derived solely from DNS subdomain metadata. It extracts cross-request similarity metrics and feeds them into Random Forest classifiers to detect malicious DNS traffic (F1 ≈ 0.966), identify the responsible malware (F1 ≈ 0.857), and infer the executed actions (e.g., upload/download). The framework is validated on seven real-world malware samples and open-source tools, augmented with legitimate DNS traffic, and organized into 43,212 feature windows. A public dataset accompanies the study. Limitations include confusion between idle patterns and the need to retrain for new malware variants; future work aims to extend the approach beyond DNS to other protocols such as HTTP(S).

Abstract

In recent years, malware with tunneling (or: covert channel) capabilities has been on the rise. While malware research has led to several methods and innovations, the detection and differentiation of malware based solely on its DNS tunneling features is still in its infancy. Moreover, no work so far has used DNS tunneling traffic to gain knowledge about the actions currently being taken by the malware. In this paper, we present Domainator, an approach to detect and differentiate state-of-the-art malware and DNS tunneling tools without relying on trivial (but quickly altered) features such as "magic bytes" that are embedded into subdomains. Instead, we apply an analysis of sequential patterns to identify specific types of malware. We evaluate our approach with 7 different malware samples and tunneling tools and can identify the particular malware based on its DNS traffic. We further infer the rough behavior of the particular malware from its DNS tunneling artifacts. Finally, we compare Domainator with related methods.
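
To make the idea of windowed, metadata-only features more concrete, the following is a minimal Python sketch, not the authors' implementation. It slides a window of three requests over a sequence of subdomain strings and averages a few per-request and cross-request metrics per window. The specific metrics (length, Shannon entropy, string similarity between consecutive requests, length difference) and all function names are assumptions chosen for illustration; this summary does not list Domainator's actual feature set.

```python
import math
from collections import Counter
from difflib import SequenceMatcher
from statistics import mean


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character (0.0 for an empty string)."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def sliding_windows(subdomains: list[str], size: int = 3) -> list[list[str]]:
    """Return every run of `size` consecutive requests (hypothetical helper)."""
    return [subdomains[i:i + size] for i in range(len(subdomains) - size + 1)]


def window_features(window: list[str]) -> dict[str, float]:
    """Average illustrative metrics over one window of subdomains.

    The metric choice is an assumption made for this sketch, not the
    feature set used in the paper.
    """
    if len(window) < 2:
        raise ValueError("a window needs at least two requests")
    # Consecutive request pairs carry the cross-request information.
    pairs = list(zip(window, window[1:]))
    return {
        "mean_length": mean(len(s) for s in window),
        "mean_entropy": mean(shannon_entropy(s) for s in window),
        "mean_similarity": mean(SequenceMatcher(None, a, b).ratio() for a, b in pairs),
        "mean_length_delta": mean(abs(len(a) - len(b)) for a, b in pairs),
    }


if __name__ == "__main__":
    # Made-up subdomain labels, purely for demonstration.
    requests = ["a1b2c3d4", "a1b2c3d5", "a1b2c3d6", "www", "mail"]
    for window in sliding_windows(requests):
        print(window, window_features(window))
```

Each resulting dictionary is one feature row. Rows like these could then be passed to a classifier such as scikit-learn's `RandomForestClassifier`, which is left out here to keep the sketch dependency-free.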

Paper Structure

This paper contains 24 sections, 7 figures, 5 tables.

Figures (7)

  • Figure 1: A visualization of the malware instance communicating with its C2 server
  • Figure 2: A single window of three requests and their combination into the statistical metrics. Each metric consists of the mean value for the window.
  • Figure 3: Classification results for the malware detection
  • Figure 4: ROC curve: malware detection
  • Figure 5: Malware identification results
  • ...and 2 more figures