In-Application Defense Against Evasive Web Scans through Behavioral Analysis

Behzad Ousat, Mahshad Shariatnasab, Esteban Schafir, Farhad Shirani Chaharsooghi, Amin Kharraz

TL;DR

WebGuard introduces an in-application, multi-modal forensics engine that unobtrusively monitors spatio-temporal browser events to distinguish humans from automated web scanners in real time. By combining offline attribution via HMMs and online detection via LSTMs, and leveraging clustering methods for trend discovery, WebGuard achieves fast, high-accuracy multiclass classification with minimal network overhead, notably using WebSocket communications (~46 bytes per payload) and under 10 KB/s total. The approach demonstrates substantial improvements over uni-modal methods, requiring shorter data sequences to reach high accuracy, and provides theoretical guarantees that adding modalities increases attacker costs and decreases time-to-detection. The work also validates practical deployment via a real-world testbed and discusses integration with existing defenses, moving-target CAPTCHAs, and broader security monitoring ecosystems.

Abstract

Web traffic has evolved to include both human users and automated agents, ranging from benign web crawlers to adversarial scanners such as those capable of credential stuffing, command injection, and account hijacking at the web scale. The financial costs of these adversarial activities are estimated to exceed tens of billions of dollars in 2023. In this work, we introduce WebGuard, a low-overhead in-application forensics engine, to enable robust identification and monitoring of automated web scanners, and help mitigate the associated security risks. WebGuard focuses on the following design criteria: (i) integration into web applications without any changes to the underlying software components or infrastructure, (ii) minimal communication overhead, (iii) capability for real-time detection, e.g., within hundreds of milliseconds, and (iv) attribution capability to identify new behavioral patterns and detect emerging agent categories. To this end, we have equipped WebGuard with multi-modal behavioral monitoring mechanisms, such as monitoring spatio-temporal data and browser events. We also design supervised and unsupervised learning architectures for real-time detection and offline attribution of human and automated agents, respectively. Information-theoretic analysis and empirical evaluations show that multi-modal data analysis, as opposed to uni-modal analysis which relies solely on mouse movement dynamics, significantly improves time-to-detection and attribution accuracy. Numerical evaluations on real-world data collected via WebGuard show high accuracy within hundreds of milliseconds, with a communication overhead below 10 KB per second.
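
The overhead figures quoted above (roughly 46 bytes per WebSocket payload, under 10 KB per second in total) imply a ceiling on how many payloads a client can send each second. The sketch below works through that arithmetic and shows how a single spatio-temporal event could fit in a few bytes. The record layout, the field names, and the helpers `encode_event` and `max_payloads_per_second` are hypothetical and chosen only for illustration; they are not WebGuard's actual wire format. The sketch also assumes 1 KB = 1000 bytes.

```python
import struct

# Hypothetical fixed-width event record (an assumption, not WebGuard's format):
#   B  event type (e.g. mouse move, click, key press)
#   I  timestamp in milliseconds since page load
#   H  x coordinate, H  y coordinate
#   B  button / modifier flags
# "<" selects little-endian with standard sizes and no padding.
EVENT_FORMAT = "<BIHHB"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)


def encode_event(event_type: int, t_ms: int, x: int, y: int, flags: int) -> bytes:
    """Pack one browser event into the hypothetical binary record."""
    return struct.pack(EVENT_FORMAT, event_type, t_ms, x, y, flags)


def max_payloads_per_second(payload_bytes: int, budget_bytes_per_s: int) -> int:
    """Largest whole number of payloads that fits in a per-second byte budget."""
    return budget_bytes_per_s // payload_bytes


PAYLOAD_BYTES = 46          # approximate payload size quoted in the TL;DR
BUDGET_BYTES_PER_S = 10_000  # 10 KB/s, assuming 1 KB = 1000 bytes

if __name__ == "__main__":
    record = encode_event(1, 1234, 100, 200, 0)
    print(f"hypothetical record size: {len(record)} bytes")
    print("payloads per second within budget:",
          max_payloads_per_second(PAYLOAD_BYTES, BUDGET_BYTES_PER_S))
```

Under these assumptions the budget allows at most 217 payloads per second, so staying below 10 KB/s means sampling or batching events at or below that rate.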

Paper Structure

This paper contains 35 sections, 2 theorems, 7 equations, 7 figures, 5 tables, 2 algorithms.

Key Result

Proposition 1

For every $\epsilon \in (0, \frac{1}{32})$, $\gamma_{\text{ps}} \in (0, \frac{1}{8})$, $s = 6k \geq 12$, and every distribution estimation procedure, there exists an $s$-state Markov chain with pseudo-spectral gap $\gamma_{\text{ps}}$ and stationary distribution $\pi$ such that the estimation procedure requires [...] to achieve $\ell_\infty$ distance less than $\epsilon$, where $c$ is a universal constant.

Figures (7)

  • Figure 1: Sample Trace of the Recorded Events based on a Simplified Login Scenario
  • Figure 2: Left: the actual labels of the input traces. Right: the output estimated labels using Algorithm \ref{alg:clustering}.
  • Figure 3: Detecting New Behavioral Trends via Clustering. Left: actual labels. Right: output estimated labels.
  • Figure 4: LSTM Accuracy of Classifying Different Agents based on Multi-Modal Artifacts
  • Figure 5: LSTM Classification Uni-Modal vs Multi-Modal
  • ...and 2 more figures

Theorems & Definitions (3)

  • Definition 1: Hidden Markov Processes
  • Proposition 1: wolfer2021statistical, Theorem 3.2
  • Lemma 1: Sampled Chernoff-Stein Lemma
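
The claim that adding modalities shortens time-to-detection can be illustrated with the classical Chernoff-Stein lemma. This is the standard i.i.d. statement, not the paper's sampled variant (Lemma 1), whose exact form is not reproduced here. The symbols $P$, $Q$, $X$, $Y$, $\alpha$, and $\beta_n^{\alpha}$ are introduced for this illustration, and independence of the two modalities under both hypotheses is an assumption.

$$
\lim_{n \to \infty} -\frac{1}{n} \log \beta_n^{\alpha} = D(P \,\|\, Q),
$$

where $\beta_n^{\alpha}$ is the smallest type-II error achievable from $n$ i.i.d. observations when the type-I error is at most $\alpha \in (0, 1)$, $P$ is the observation distribution for one agent class, and $Q$ is the distribution for the other. If each observation consists of two modalities $(X, Y)$ that are independent under both hypotheses, so that $P = P_X \times P_Y$ and $Q = Q_X \times Q_Y$, the divergence is additive:

$$
D(P_X \times P_Y \,\|\, Q_X \times Q_Y) = D(P_X \,\|\, Q_X) + D(P_Y \,\|\, Q_Y) \geq D(P_X \,\|\, Q_X).
$$

The number of observations needed to reach a target type-II error $\beta$ scales roughly as $n \approx \log(1/\beta) / D(P \,\|\, Q)$. A second modality with $D(P_Y \,\|\, Q_Y) > 0$ therefore lowers the required sequence length, which is consistent with the shorter sequences reported for multi-modal detection.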