DPMon: a Differentially-Private Query Engine for Passive Measurements

Martino Trevisan

TL;DR

DPMon tackles privacy concerns in passive network measurements by applying differential privacy to per-user query results. It implements a modular, open-source query engine that can run locally or on a Spark cluster, supports NetFlow and Tstat data, and uses DiffPrivLib to inject controlled noise. The authors validate the approach on campus network data (≈400M flows, 135 GB), illustrating how the privacy budget $\epsilon$ affects utility and showing that actionable network insights can still be obtained while protecting user privacy. This open-source, scalable framework enables safe data sharing and collaboration in network research and practice.

Abstract

Passive monitoring is a network measurement technique that analyzes the traffic carried by an operational network. It has several applications in traffic engineering, Quality of Experience monitoring and cyber security. However, it entails the processing of personal information, thus threatening users' privacy. In this work, we propose DPMon, a tool to run privacy-preserving queries on a dataset of passive network measurements. It exploits differential privacy to perturb the output of the query and thereby preserve users' privacy. DPMon can exploit big data infrastructures running Apache Spark and operate on different data formats. We show that DPMon allows extracting meaningful insights from the data, while at the same time controlling the amount of disclosed information.
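
To make the idea of perturbing a query output concrete, the following is a minimal sketch of the Laplace mechanism applied to a per-user query (the share of users who accessed a given website). It is not DPMon's code or API: the function names `laplace_noise` and `private_user_share` are hypothetical, the noise is drawn with the Python standard library rather than DiffPrivLib, and the total number of users is assumed to be public. Each user contributes at most 1 to the count, so the sensitivity is 1 and the noise scale is $1/\epsilon$.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_user_share(user_visited, epsilon, rng=None):
    """Return a noisy share of users who visited a site.

    user_visited: dict mapping a user id to True/False (hypothetical input).
    epsilon: privacy budget; smaller values add more noise.
    Assumption: the number of users is public, so only the count is noised.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not user_visited:
        raise ValueError("at least one user is required")
    rng = rng or random.Random()

    n_users = len(user_visited)
    true_count = sum(1 for visited in user_visited.values() if visited)

    # One user changes the count by at most 1, so sensitivity = 1.
    noisy_count = true_count + laplace_noise(1.0 / epsilon, rng)

    # Clamping is post-processing and does not weaken the guarantee.
    noisy_count = min(max(noisy_count, 0.0), float(n_users))
    return noisy_count / n_users


if __name__ == "__main__":
    users = {f"user{i}": (i % 4 == 0) for i in range(1000)}
    for eps in (0.01, 0.1, 1.0):
        print(eps, private_user_share(users, eps, random.Random(0)))
```

With a small $\epsilon$ the returned share can be far from the true value, while a large $\epsilon$ yields an answer close to it at the cost of a weaker privacy guarantee.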

Paper Structure

This paper contains 13 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: DPMon typical deployment.
  • Figure 2: DPMon query flow.
  • Figure 3: DPMon query flow.
  • Figure 4: Distribution of the results of a query computing the share of users accessing a given website with different privacy budgets $\epsilon$. The marker represents the median, the bars span from the $5^{th}$ to the $95^{th}$ percentile.
  • Figure 5: Distribution of the weekly per-user volume, separately by direction and Layer-4 protocol. We set $\epsilon=1$.