Protect Your Score: Contact Tracing With Differential Privacy Guarantees

Rob Romijnders, Christos Louizos, Yuki M. Asano, Max Welling

TL;DR

This paper tackles privacy concerns in contact tracing by formalizing a privacy attack on the covidscore and proposing a decentralized differential privacy framework. It introduces Differentially Private Factorized Neighbors (DPFN), which uses log-normal noise on products of neighbor messages and Rényi differential privacy to provide $(\varepsilon,\delta)$-DP guarantees while maintaining utility. Evaluations on OpenABM-Covid19 and Covasim show that, at $\varepsilon=1$, DPFN achieves substantially lower peak infection rates than traditional methods, Gibbs sampling, or per-message DP, and scales to simulations with up to $10^6$ agents. The work discusses policy-relevant implications, limitations, and directions for future research, including repeated contacts and partial adoption, and provides open-source code and a 14-day data window strategy.
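The noise step mentioned above can be illustrated with a minimal sketch. Multiplying a positive quantity by log-normal noise is the same as adding Gaussian noise to its logarithm, so the standard Rényi DP bound for the Gaussian mechanism, $(\alpha, \alpha\Delta^2/(2\sigma^2))$, and the standard conversion to $(\varepsilon,\delta)$-DP can be used to choose $\sigma$. The log-domain sensitivity `sensitivity`, the value of `delta`, and all function names below are hypothetical and chosen for illustration; this is not the calibration derived in the paper.

```python
import math
import random


def rdp_to_dp_epsilon(sigma, sensitivity, delta, alphas=None):
    """Epsilon of the Gaussian mechanism via Renyi DP, minimized over a grid of orders.

    Uses the standard bounds: the mechanism is (alpha, alpha * s^2 / (2 sigma^2))-RDP,
    and (alpha, rho)-RDP implies (rho + log(1/delta) / (alpha - 1), delta)-DP.
    The grid of orders (1.1 to 100.9) is an assumption; very small target
    epsilons would need larger orders.
    """
    if alphas is None:
        alphas = [1.0 + 0.1 * k for k in range(1, 1000)]
    best = math.inf
    for alpha in alphas:
        rdp = alpha * sensitivity ** 2 / (2.0 * sigma ** 2)
        eps = rdp + math.log(1.0 / delta) / (alpha - 1.0)
        best = min(best, eps)
    return best


def calibrate_sigma(epsilon, delta, sensitivity, lo=1e-3, hi=1e3, iters=100):
    """Geometric bisection for a small sigma that still meets the target epsilon.

    Assumes the target is met at `hi` and violated at `lo`.
    """
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if rdp_to_dp_epsilon(mid, sensitivity, delta) > epsilon:
            lo = mid  # too little noise
        else:
            hi = mid  # enough noise, try less
    return hi


def lognormal_release(value, sigma, rng):
    """Release a positive value with multiplicative log-normal noise.

    Equivalent to adding N(0, sigma^2) noise to log(value).
    """
    return math.exp(math.log(value) + rng.gauss(0.0, sigma))


if __name__ == "__main__":
    # Hypothetical settings: log-domain sensitivity 1.0 and delta 1e-5.
    sigma = calibrate_sigma(epsilon=1.0, delta=1e-5, sensitivity=1.0)
    rng = random.Random(0)
    print(sigma, lognormal_release(0.1, sigma, rng))
```

A noisy value released this way can exceed 1; clipping it back to a valid range is post-processing and does not weaken the differential privacy guarantee.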

Abstract

The pandemic in 2020 and 2021 had enormous economic and societal consequences, and studies show that contact tracing algorithms can be key in the early containment of the virus. While large strides have been made towards more effective contact tracing algorithms, we argue that privacy concerns currently hold deployment back. The essence of a contact tracing algorithm is the communication of a risk score. Yet, it is precisely the communication and release of this score to a user that an adversary can leverage to gauge the private health status of an individual. We pinpoint a realistic attack scenario and propose a contact tracing algorithm with differential privacy guarantees against this attack. The algorithm is tested on the two most widely used agent-based COVID19 simulators and demonstrates superior performance in a wide range of settings. Especially for realistic test scenarios and while releasing each risk score with $\varepsilon=1$ differential privacy, we achieve a two- to ten-fold reduction in the infection rate of the virus. To the best of our knowledge, this presents the first contact tracing algorithm with differential privacy guarantees when revealing risk scores for COVID19.

Paper Structure

This paper contains 26 sections, 36 equations, 7 figures, 4 tables, and 1 algorithm.

Figures (7)

  • Figure 1: Example of a contact graph. This user has $C_1$ contacts at five time steps in the past and $C_2$ contacts at three time steps in the past. The released covidscore is the estimate of being in state $I$ at time step $t$. Appendix \ref{app:fn_traces_explicit} generalizes the method for a general contact graph.
  • Figure 2: The effect of differential privacy on the approximate inference from FN. An example user has two contacts. In the left column, both contacts have a low covidscore; in the right column, one contact has a high covidscore. The red-shaded region indicates the 20-80 quantiles for sampling the covidscore from the DPFN mechanism of Algorithm \ref{alg:dpfn}. The regions overlap, which reflects privacy, but the median red line is higher in the right column, which indicates a possible infection and could inform a testing policy.
  • Figure 3: The privacy-utility trade-off for differentially private contact tracing. The y-axis indicates the peak infection rate, where lower is better. At $\varepsilon=1$, a common setting for differential privacy, DPFN achieves a lower peak infection rate than all other methods. OpenABM and Covasim are the two most widely used simulators for COVID19. Error bars indicate 20-80 quantiles for ten random restarts.
  • Figure 4: Recall and average precision during a simulation on OpenABM. The shaded regions indicate the 20-80 quantiles of twenty random restarts. The peak infection rate of the $\varepsilon=1$ curve lies between those of the lower and higher privacy levels. The recall and average precision likewise take intermediate values during the first month of the epidemic simulation. The recall and average precision diagrams plot only the first 30 days of simulation, which is the crucial phase of a pandemic \cite{perra2021non}.
  • Figure 5: The scenario where a user tests positive but does not isolate. 'Loss to follow-up' is the probability that a user ignores the request to isolate after a positive test. The DPFN method achieves a lower peak infection rate (PIR) than traditional contact tracing across a wide range of 'loss to follow-up' probabilities.
  • ...and 2 more figures