A Measurement of Genuine Tor Traces for Realistic Website Fingerprinting

Rob Jansen, Ryan Wails, Aaron Johnson

TL;DR

This work addresses the gap between website fingerprinting (WF) research and real-world Tor traffic by introducing GTT23, the first dataset of labeled genuine Tor traces collected via a safety-conscious exit-relay measurement plan. It details the measurement methodology, including circuit sampling, per-circuit metadata, and encrypted exports, designed with IRB and Tor Safety Board oversight. By comparing GTT23 to 28 synthetic WF datasets, the paper demonstrates substantial realism gaps in synthetic data, particularly in traffic diversity, base-rate realism, and intra-class variance, which can bias attack evaluations. The authors argue that GTT23 enables more accurate evaluation of WF attacks and defenses for real-world scenarios and may benefit broader Tor traffic analysis research; the dataset is available on request, with discussion of limitations and avenues for future improvements such as self-supervised training approaches and extended measurement campaigns.
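The base-rate point above can be made concrete with a small worked example. This is an illustrative sketch only: the function name `precision` and all rates below are hypothetical and are not taken from the paper or from GTT23. It applies Bayes' rule to show how the precision of one fixed classifier changes as the monitored site becomes rarer in the traffic.

```python
def precision(tpr: float, fpr: float, base_rate: float) -> float:
    """Fraction of positive predictions that are correct (Bayes' rule).

    tpr: true-positive rate of the classifier
    fpr: false-positive rate of the classifier
    base_rate: fraction of all traces that belong to the monitored site
    """
    true_pos = tpr * base_rate
    false_pos = fpr * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)


# Hypothetical classifier: 95% true-positive rate, 1% false-positive rate.
# Only the assumed base rate changes between rows.
for base_rate in (0.5, 0.01, 0.0001):
    p = precision(0.95, 0.01, base_rate)
    print(f"base rate {base_rate:g}: precision {p:.4f}")
```

With a balanced base rate, as is common in synthetic evaluations, nearly every positive prediction is correct. With a rare monitored site, most positive predictions are false alarms, even though the classifier itself is unchanged.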

Abstract

Website fingerprinting (WF) is a dangerous attack on web privacy because it enables an adversary to predict the website a user is visiting, despite the use of encryption, VPNs, or anonymizing networks such as Tor. Previous WF work almost exclusively uses synthetic datasets to evaluate the performance and estimate the feasibility of WF attacks despite evidence that synthetic data misrepresents the real world. In this paper we present GTT23, the first WF dataset of genuine Tor traces, which we obtain through a large-scale measurement of the Tor network and which is intended especially for WF. It represents real Tor user behavior better than any existing WF dataset, is larger than any existing WF dataset by at least an order of magnitude, and will help ground the future study of realistic WF attacks and defenses. In a detailed evaluation, we survey 28 WF datasets published since 2008 and compare their characteristics to those of GTT23. We discover common deficiencies of synthetic datasets that make them inferior to GTT23 for drawing meaningful conclusions about the effectiveness of WF attacks directed at real Tor users. We have made GTT23 available to promote reproducible research and to help inspire new directions for future work.

Paper Structure

This paper contains 26 sections, 1 equation, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: The daily total (bars) and weekly mean (text) number of circuits during our 13-week measurement.
  • Figure 2: The total number of GTT23 circuits by server port, with IANA-assigned service names [IANA2023].
  • Figure 3: Cumulative distribution of the number of cells per circuit over subsets of GTT23 circuits.
  • Figure 4: The number of GTT23 circuits per domain; we observe a close fit to a power-law distribution.
  • Figure 5: Cumulative distribution of circuit length variation across domains with at least two GTT23 circuits.
  • ...and 1 more figure