TSA-WF: Exploring the Effectiveness of Time Series Analysis for Website Fingerprinting
Michael Wrana, Uzma Maroof, Diogo Barradas
TL;DR
The paper reframes website fingerprinting (WF) as a time-series matching problem and introduces TSA-WF, a pipeline that preserves packet timing and direction so that classical time-series similarity measures can be applied to WF. It describes prototype selection, the computation of multiple distance measures, a prediction model, and a method for untangling multi-tab traces to approximate where in a trace a monitored site was visited. On Tor traces, TSA-WF achieves 91.2% accuracy on undefended traces in the single-tab, open-world setting, and in 3-tab traces it locates a monitored website to within about 10k packets with 83.7% success, though it trails deep-learning-based attacks in multi-tab settings. Overall, the work shows that time-series approaches to WF are viable and complementary to existing attacks, and it suggests combining them with deep learning for robust multi-tab analysis.
Abstract
Website fingerprinting (WF) is a technique that allows an eavesdropper to determine the website a target user is accessing by inspecting the metadata associated with the packets she exchanges via some encrypted tunnel, e.g., Tor. Recent WF attacks built using machine learning (and deep learning) process and summarize trace metadata during their feature extraction phases. This methodology leads to predictions that lack information about the instant at which a given website is detected within a (potentially large) network trace composed of multiple sequential website accesses -- a setting known as \textit{multi-tab} WF. In this paper, we explore whether classical time series analysis techniques can be effective in the WF setting. Specifically, we introduce TSA-WF, a pipeline designed to closely preserve network traces' timing and direction characteristics, which enables the exploration of algorithms designed to measure time series similarity in the WF context. Our evaluation with Tor traces reveals that TSA-WF achieves accuracy comparable to that of existing WF attacks in scenarios where website accesses can be easily singled out from a given trace (i.e., the \textit{single-tab} WF setting), even when shielded by specially designed WF defenses. Finally, while TSA-WF did not outperform existing attacks in the multi-tab setting, we show how TSA-WF can help pinpoint the approximate instant at which a given website of interest is visited within a multi-tab trace.\footnote{This preprint has not undergone any post-submission improvements or corrections. The Version of Record of this contribution is published in the Proceedings of the 20th International Conference on Availability, Reliability and Security (ARES 2025).}
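To make the idea of matching traces as time series concrete, the following is a minimal illustrative sketch, not the TSA-WF implementation. It assumes a trace is a list of (timestamp, direction) pairs, bins it into a signed packet-count series, and labels it by the nearest prototype under dynamic time warping (DTW). DTW is used here only as one well-known classical similarity measure; the abstract does not name the distances, binning, or prototype selection that TSA-WF uses, and the names `trace_to_series`, `dtw_distance`, `nearest_prototype`, and `bin_width` are hypothetical.

```python
from math import inf


def trace_to_series(packets, bin_width=0.1):
    """Bin (timestamp, direction) pairs into a signed packet-count series.

    direction is +1 for outgoing and -1 for incoming packets (an assumed
    encoding); bin_width is in the same unit as the timestamps.
    """
    if not packets:
        return []
    packets = sorted(packets)  # order by timestamp
    start = packets[0][0]
    n_bins = int((packets[-1][0] - start) / bin_width) + 1
    series = [0] * n_bins
    for ts, direction in packets:
        series[int((ts - start) / bin_width)] += direction
    return series


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute-difference cost."""
    m = len(b)
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, len(a) + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def nearest_prototype(series, prototypes):
    """Return the label whose closest prototype series has the lowest DTW.

    prototypes maps a website label to a list of prototype series.
    """
    best_label, best_dist = None, inf
    for label, protos in prototypes.items():
        for proto in protos:
            dist = dtw_distance(series, proto)
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    prototypes = {"site_a": [[1, -2, -1]], "site_b": [[3, 3, 3]]}
    print(nearest_prototype([1, -2, -2, -1], prototypes))
```

Because the series keeps a time axis, a distance of this kind can in principle be evaluated over sliding windows of a longer trace, which is the property that makes locating a visit inside a multi-tab trace plausible.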
