Distinguishing Tor From Other Encrypted Network Traffic Through Character Analysis

Pitpimon Choorod; Tobias J. Bauer; Andreas Aßmuth

TL;DR

The paper investigates whether Tor traffic can be distinguished from other encrypted traffic by analyzing hex-digit distributions in a single encrypted payload, and tests the hypothesis that multiple encryptions in Tor produce detectable patterns. Grounded in Shannon's perfect secrecy condition $Pr(C=c\,|\,M=m)=Pr(C=c)$ and adversarial indistinguishability, it compares prior hex-frequency approaches with controlled AES-based single-vs-triple-encryption experiments across CBC, CTR, and ECB modes. It finds that classifiers using 16-hex-digit frequency features achieve near-random accuracy ($\approx50\%$) for distinguishing single- vs triple-encrypted data, even in ECB mode, challenging the idea that the number of encryptions drives distinguishability. The results imply that the previously reported high Tor-vs-non-Tor discrimination cannot be attributed solely to encryption count, highlighting a need to identify the actual factors behind traffic distinguishability in practice.

Abstract

For journalists reporting from a totalitarian regime, whistleblowers, and resistance fighters, the anonymous use of cloud services on the Internet can be vital for survival. The Tor network provides a free and widely used anonymization service for everyone. However, there are several approaches to distinguishing Tor from non-Tor encrypted network traffic; the most recent relies solely on the (relative) frequencies of hex digits in a single encrypted payload packet. Conventional data traffic is usually encrypted once, whereas Tor traffic is encrypted at least three times owing to the structure and principle of the Tor network. We have therefore examined to what extent the number of encryptions contributes to being able to distinguish Tor from non-Tor encrypted data traffic.
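To make the feature concrete, the following is a minimal sketch of how the 16 relative hex-digit frequencies of a single payload could be computed, and how a singly and a triply encrypted payload might be compared. It is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the sample plaintext is made up, and `toy_stream_encrypt` is a stand-in cipher (a SHA-256 counter-mode keystream XORed with the data) used only so the example runs with the standard library. It is not AES and not Tor's onion encryption.

```python
import hashlib
import os
from collections import Counter

HEX_DIGITS = "0123456789abcdef"


def hex_digit_frequencies(payload: bytes) -> list[float]:
    """Return the relative frequency of each of the 16 hex digits in payload."""
    if not payload:
        raise ValueError("payload must not be empty")
    digits = payload.hex()          # two hex digits per byte, lowercase
    counts = Counter(digits)
    total = len(digits)
    return [counts[d] / total for d in HEX_DIGITS]


def toy_stream_encrypt(key: bytes, data: bytes) -> bytes:
    """Stand-in cipher (NOT AES): XOR data with a SHA-256 counter keystream.

    Applying the function twice with the same key restores the input.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def max_deviation_from_uniform(freqs: list[float]) -> float:
    """Largest absolute gap between any digit frequency and the uniform 1/16."""
    return max(abs(f - 1 / 16) for f in freqs)


# Hypothetical sample data: a highly repetitive plaintext.
plaintext = b"GET /index.html HTTP/1.1\r\n" * 400
k1, k2, k3 = (os.urandom(32) for _ in range(3))

# One layer of encryption versus three layers with independent keys.
single = toy_stream_encrypt(k1, plaintext)
triple = toy_stream_encrypt(k3, toy_stream_encrypt(k2, single))

single_freqs = hex_digit_frequencies(single)
triple_freqs = hex_digit_frequencies(triple)

if __name__ == "__main__":
    print("plaintext deviation:",
          max_deviation_from_uniform(hex_digit_frequencies(plaintext)))
    print("single deviation:   ", max_deviation_from_uniform(single_freqs))
    print("triple deviation:   ", max_deviation_from_uniform(triple_freqs))
```

With a keystream that behaves like random data, both ciphertexts should show hex-digit frequencies close to 1/16 each, while the repetitive plaintext does not. This matches the intuition behind the paper's question: if a single layer already yields a near-uniform distribution, additional layers leave little room for a frequency-based feature to separate the two cases.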

Paper Structure (5 sections, 2 equations, 4 figures, 1 table)

Introduction
Related Work
Preliminary Work
New Experiments
Conclusion and Future Work

Figures (4)

Figure 1: Results of the approach proposed in Choorod2023.
Figure 2: Results with the ECB mode of operation.
Figure 3: Results with the CBC mode of operation.
Figure 4: Results with the CTR mode of operation.
