A Comprehensive Survey of Website Fingerprinting Attacks and Defenses in Tor: Advances and Open Challenges
Yuwen Cui, Guangjing Wang, Khanh Vu, Kai Wei, Kehan Shen, Zhengyuan Jiang, Xiao Han, Ning Wang, Zhuo Lu, Yao Liu
TL;DR
This survey maps the website fingerprinting (WF) landscape in Tor by unifying datasets, attack methodologies, and defense strategies. It synthesizes traditional ML and deep learning WF attacks, evaluates them across closed-world and open-world settings, and surveys defenses spanning adaptive padding, traffic regularization, traffic morphing, and adversarial perturbations. The work identifies core challenges (data drift, multi-tab browsing, dataset realism, and deployment practicality) and outlines future directions, including data diversity, adaptive modeling, and the emerging role of LLMs in WF. By consolidating prior work and proposing a structured framework, the paper aims to guide researchers and practitioners toward more robust and deployable privacy protections for Tor users.
Abstract
The Tor network provides users with strong anonymity by routing their internet traffic through multiple relays. While Tor encrypts traffic and hides IP addresses, it remains vulnerable to traffic analysis attacks such as website fingerprinting (WF), which achieves increasingly high accuracy even under open-world conditions. In response, researchers have proposed a variety of defenses, ranging from adaptive padding, traffic regularization, and traffic morphing to adversarial perturbation, that seek to obfuscate or reshape traffic traces. However, these defenses often entail trade-offs between privacy, usability, and system performance. Despite extensive research, a comprehensive survey unifying WF datasets, attack methodologies, and defense strategies is still missing. This paper fills that gap by systematically categorizing existing WF research into three key domains: datasets, attack models, and defense mechanisms. We provide an in-depth comparative analysis of techniques, highlight their strengths and limitations under diverse threat models, and discuss emerging challenges such as multi-tab browsing and coarse-grained traffic features. By consolidating prior work and identifying open research directions, this survey serves as a foundation for advancing stronger privacy protection in Tor.
