MAC Address Anonymization for Crowd Counting
Jean-François Determe, Sophia Azzagnuni, François Horlin, Philippe De Doncker
TL;DR
The paper tackles privacy-preserving crowd counting from WiFi probe requests: MAC addresses are hashed after prepending time-varying peppers, producing 64-bit SA identifiers that stay unlinkable across time. It presents a rigorous collision-rate analysis, deriving exact and approximate formulas with analytical error bounds, and shows that with $m=2^{64}$ possible identifiers and up to $10^7$ MAC addresses, the expected collision rate is about $10^{-12.5}$, well below the $10^{-9}$ target. It also shows that time-synchronization errors of around $10$ ms have a negligible impact on counts. The contributions include a novel two-part pepper scheme, formal privacy properties (intractability and non-tracking), and a tractable mathematical framework for collision analysis, along with a discussion of practical validation and limitations. The results enable scalable, privacy-preserving crowd counting at large events, and the techniques could generalize to other domains that require hashed identifiers with time-varying salts.
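As a rough illustration of the idea of hashing a MAC address with a time-varying pepper, the following Python sketch derives a pepper from a shared secret and a time-window index, prepends it to the MAC address, and truncates the hash to 64 bits. Everything specific here is an assumption made for illustration and is not taken from the paper: the use of SHA-256 and HMAC, the 900-second window, the single-part pepper derivation, and the names `pepper_for_window` and `anonymize_mac`. In particular, it does not reproduce the paper's two-part pepper scheme.

```python
import hashlib
import hmac


def pepper_for_window(secret: bytes, t_seconds: float, window_s: int = 900) -> bytes:
    """Derive a pepper that changes every `window_s` seconds (assumed scheme)."""
    window_index = int(t_seconds // window_s)
    # HMAC keeps the pepper unpredictable without knowledge of the secret.
    return hmac.new(secret, window_index.to_bytes(8, "big"), hashlib.sha256).digest()


def anonymize_mac(mac: str, pepper: bytes) -> int:
    """Hash pepper || MAC and keep the first 64 bits as the identifier."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("a MAC address must be 6 bytes long")
    digest = hashlib.sha256(pepper + mac_bytes).digest()
    return int.from_bytes(digest[:8], "big")


if __name__ == "__main__":
    secret = b"hypothetical-shared-secret"
    mac = "a4:5e:60:12:34:56"
    # Two sensors whose clocks differ by 10 ms fall in the same window here.
    id_a = anonymize_mac(mac, pepper_for_window(secret, 1000.00))
    id_b = anonymize_mac(mac, pepper_for_window(secret, 1000.01))
    # A later window yields an unrelated identifier for the same device.
    id_c = anonymize_mac(mac, pepper_for_window(secret, 1900.00))
    print(f"{id_a:016x} {id_b:016x} {id_c:016x}")
```

In this sketch, a clock offset between sensors only matters when a packet is observed within that offset of a window boundary, which is one intuitive reason why small synchronization errors affect few identifiers.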
Abstract
Research has shown that counting WiFi packets called probe requests (PRs) provides a proxy for the number of people in an area. In this paper, we discuss a crowd counting system in which WiFi sensors detect PRs over the air, then extract and anonymize their media access control (MAC) addresses using a hash-based approach. We describe the anonymization procedure and show that time-synchronization inaccuracies among sensors and hash collision rates are low enough that anonymization does not interfere with counting algorithms. In particular, we derive an approximation of the collision rate of uniformly distributed identifiers, with analytical error bounds.
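To give a feel for the orders of magnitude involved, the sketch below evaluates a standard first-order approximation for $n$ identifiers drawn uniformly from $m$ possible values. It assumes, for illustration only, that the collision rate is the expected fraction of identifiers lost because they coincide with another one, i.e. $1 - \frac{m}{n}\left(1 - (1 - 1/m)^n\right) \approx \frac{n-1}{2m}$; the paper's exact definition and error bounds may differ. The function names are hypothetical.

```python
import math


def exact_collision_rate(n: int, m: int) -> float:
    """Expected fraction of n uniform identifiers lost to collisions.

    Uses E[distinct] = m * (1 - (1 - 1/m)**n). Evaluated in floating point,
    so it is only reliable when the rate is not vanishingly small.
    """
    expected_distinct = -m * math.expm1(n * math.log1p(-1.0 / m))
    return 1.0 - expected_distinct / n


def approx_collision_rate(n: int, m: int) -> float:
    """First-order approximation (n - 1) / (2 m), valid for n much smaller than m."""
    return (n - 1) / (2 * m)


if __name__ == "__main__":
    # Sanity check on a small case where floating point is accurate.
    print(exact_collision_rate(50, 1000), approx_collision_rate(50, 1000))

    # Setting quoted for the system: 64-bit identifiers, up to 10^7 MAC addresses.
    rate = approx_collision_rate(10**7, 2**64)
    print(f"approximate collision rate: 10^{math.log10(rate):.2f}")
```

Under this assumed definition, the approximation for $n = 10^7$ and $m = 2^{64}$ lands several orders of magnitude below $10^{-9}$.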
