Seqnature: Extracting Network Fingerprints from Packet Sequences
Janus Varmarken, Rahmadi Trimananda, Athina Markopoulou
TL;DR
Seqnature tackles the challenge of identifying applications and events from network traffic by unifying fingerprint extraction into a single framework that operates on packet sequences. It introduces a two-phase workflow (preprocessing, then fingerprint refinement) that converts traffic into feature-rich TCP streams and then uses clustering to produce seqnatures, which represent consistently occurring packet sequences. The paper demonstrates five fingerprinting techniques, spanning data-exchange and endpoint-based perspectives, and applies them to two public datasets (FingerprinTV and PingPong) to compare prevalence and distinctiveness, including a thorough false-positive analysis. The results corroborate prior findings that endpoint information alone is insufficient to differentiate between individual events on IoT devices, while also showing that smart TV app fingerprints based exclusively on endpoint information are not as distinct as previously reported. Overall, Seqnature provides a flexible, extensible platform for evaluating and designing fingerprinting methods, with potential implications for privacy, security, and resilient network protocol design.
Abstract
This paper proposes a general network fingerprinting framework, Seqnature, that uses packet sequences as its basic data unit and that makes it simple to implement any fingerprinting technique that can be formulated as a problem of identifying packet exchanges that consistently occur when the fingerprinted event is triggered. We demonstrate the versatility of Seqnature by using it to implement five different fingerprinting techniques, as special cases of the framework, which broadly fall into two categories: (i) fingerprinting techniques that consider features of each individual packet in a packet sequence, e.g., size and direction; and (ii) fingerprinting techniques that only consider stream-wide features, specifically what Internet endpoints are contacted. We illustrate how Seqnature facilitates comparisons of the relative performance of different fingerprinting techniques by applying the five fingerprinting techniques to datasets from the literature. The results confirm findings in prior work, for example that endpoint information alone is insufficient to differentiate between individual events on Internet of Things devices, but also show that smart TV app fingerprints based exclusively on endpoint information are not as distinct as previously reported.
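To make the two categories above concrete, the following is a minimal sketch, not Seqnature's actual implementation. It assumes a hypothetical data layout in which each trace (one triggering of the event) is a list of streams, and each stream is a dict with an `endpoint` string and a `packets` list of `(direction, size)` pairs. The function names, the `min_prevalence` parameter, and the example hostnames are all illustrative assumptions. Exact sequence matching stands in here for the clustering-based refinement mentioned in the TL;DR.

```python
from collections import Counter

def packet_sequence_fingerprints(traces, min_prevalence=1.0):
    """Category (i): return packet sequences seen in at least
    min_prevalence (a fraction in [0, 1]) of the traces.
    Assumption: sequences are compared by exact equality."""
    counts = Counter()
    for trace in traces:
        # Count each distinct sequence at most once per trace.
        seen = {tuple(stream["packets"]) for stream in trace}
        counts.update(seen)
    threshold = min_prevalence * len(traces)
    return {seq for seq, n in counts.items() if n >= threshold}

def endpoint_fingerprint(traces):
    """Category (ii): return the endpoints contacted in every trace."""
    endpoint_sets = [{s["endpoint"] for s in trace} for trace in traces]
    return set.intersection(*endpoint_sets) if endpoint_sets else set()

# Two hypothetical traces of the same event (made-up values).
traces = [
    [
        {"endpoint": "api.example.com",
         "packets": [("C->S", 556), ("S->C", 1293)]},
        {"endpoint": "ads.example.net",
         "packets": [("C->S", 120)]},
    ],
    [
        {"endpoint": "api.example.com",
         "packets": [("C->S", 556), ("S->C", 1293)]},
        {"endpoint": "cdn.example.org",
         "packets": [("C->S", 300), ("S->C", 90)]},
    ],
]

sequence_fps = packet_sequence_fingerprints(traces)
endpoint_fp = endpoint_fingerprint(traces)
```

In this toy input, only the request/response exchange with `api.example.com` recurs in both traces, so it is the sole candidate under either perspective. The per-packet variant keeps sizes and directions, whereas the endpoint variant keeps only the hostname, which is why distinct events that contact the same endpoint would be indistinguishable under the second approach.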
