Dismantling Common Internet Services for Ad-Malware Detection

Florian Nettersheim, Stephan Arlt, Michael Rademacher

TL;DR

The paper investigates who defines ad-malware on the web by comparing threat labeling across common Internet services, including filtered DNS endpoints and VirusTotal. It extends the Kattikatti crawler framework with a Threat Intel Broker and an Ad-Malware Detector to label HTTP requests from ad-related traffic, enabling automated cross-service analysis. Results reveal substantial inconsistencies: DNS providers label only a small fraction of domains as malicious, VirusTotal flags a larger portion but with significant partner disagreement, and only a tiny share of flagged domains are actually ad-malware. The work highlights the lack of a shared definition for ad-malware and argues for standardized labeling approaches (e.g., Maat) and more nuanced detection methods to improve web safety and transparency for users and researchers alike.

Abstract

Online advertising is a main instrument for publishers to fund content on the World Wide Web. Unfortunately, a significant number of online advertisements carry potentially malicious content, such as cryptojacking hidden in web banners, even on reputable websites. To protect Internet users from such online threats, the thorough detection of ad-malware campaigns plays a crucial role for a safe Web. Today, common Internet services like VirusTotal can label suspicious content based on feedback from contributors and from the entire Web community. However, it remains open to what extent ad-malware is actually taken into account and whether the results of these services are consistent. In this pre-study, we evaluate who defines ad-malware on the Internet. In a first step, we crawl a vast set of websites and collect all HTTP requests (particularly to online advertisements) within these websites. Then we query these requests against both popular filtered DNS providers and VirusTotal. The idea is to determine how much content is labeled as a potential threat. The results show that up to 0.47% of the domains found during crawling are labeled as suspicious by DNS providers and up to 8.8% by VirusTotal. Moreover, only about 0.7% to 3.2% of these domains are categorized as ad-malware. The overall responses from the used Internet services paint a divergent picture: all considered services have different understandings of what counts as suspicious content. Thus, we outline potential research efforts toward the automated detection of ad-malware. We further raise the open question of a common definition of ad-malware with the Web community.
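
The comparison described in the abstract can be illustrated with a minimal sketch, assuming the crawl and the service lookups have already been run and their results reduced to one set of flagged domains per service. The function names, service labels, and example domains below are hypothetical and not taken from the paper, and Jaccard similarity is only one possible agreement measure, not necessarily the one the authors use.

```python
from itertools import combinations


def flagged_share(flagged: set[str], crawled: set[str]) -> float:
    """Fraction of crawled domains that one service labels as a threat."""
    if not crawled:
        return 0.0
    # Only count flagged domains that were actually seen during the crawl.
    return len(flagged & crawled) / len(crawled)


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap of two flagged sets: 1.0 means identical, 0.0 means disjoint."""
    union = a | b
    # Two services that both flag nothing are treated as fully agreeing.
    return len(a & b) / len(union) if union else 1.0


def pairwise_agreement(verdicts: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Jaccard agreement for every pair of services, keyed by sorted name pair."""
    return {
        (s1, s2): jaccard(verdicts[s1], verdicts[s2])
        for s1, s2 in combinations(sorted(verdicts), 2)
    }


# Hypothetical example data (not results from the paper).
crawled = {"a.example", "b.example", "c.example", "d.example"}
verdicts = {
    "filtered_dns": {"a.example"},
    "virustotal": {"a.example", "b.example"},
}

shares = {name: flagged_share(flagged, crawled) for name, flagged in verdicts.items()}
agreement = pairwise_agreement(verdicts)
```

A low pairwise score between two services would correspond to the divergent picture the abstract reports: the services flag different domains, not merely different numbers of them.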

Paper Structure

This paper contains 5 sections and 4 figures.

Figures (4)

  • Figure 1: Overview of our current approach to ad-malware detection.
  • Figure 2: Blocked domains from different DNS providers.
  • Figure 3: Blocked domains by DNS provider filtered by online advertisements.
  • Figure 4: Opinions of different VirusTotal (VT) partners on whether a given domain is considered harmless or a potential threat (malicious or suspicious).