A Quantitative Information Flow Analysis of the Topics API

Mário S. Alvim, Natasha Fernandes, Annabelle McIver, Gabriel H. Nunes

TL;DR

The paper's QIF model supports a theoretical analysis of both the privacy and utility aspects of the Topics API and of their trade-off, and shows that the Topics API offers better privacy than third-party cookies.

Abstract

Third-party cookies have been a privacy concern since cookies were first developed in the mid-1990s, but stricter cookie policies were only introduced by Internet browser vendors in the early 2010s. More recently, due to regulatory changes, browser vendors have started to completely block third-party cookies, with both Firefox and Safari already compliant. The Topics API is being proposed by Google as an additional and less intrusive source of information for interest-based advertising (IBA), following the upcoming deprecation of third-party cookies. Initial results published by Google estimate that the probability of correctly re-identifying a random individual would be below 3% while still supporting IBA. In this paper, we analyze the re-identification risk for individual Internet users introduced by the Topics API from the perspective of Quantitative Information Flow (QIF), an information- and decision-theoretic framework. Our model allows a theoretical analysis of both privacy and utility aspects of the API and their trade-off, and we show that the Topics API does have better privacy than third-party cookies. We leave the utility analyses for future work.

Paper Structure (14 sections, 2 theorems, 5 equations, 2 figures, 1 table)


Key Result

Lemma 2

Given the adversary's uniform initial probability distribution on individuals, the deterministic channel mapping Internet users to browsing histories, and the Bayes vulnerability measure, the final vulnerability for third-party cookies is: where $M'$ is the number of contexts on the Internet that include third-party cookies, $k'$ is the number of contexts on an Internet user's browsing history th

Figures (2)

  • Figure 1: Information collected by an interest-based advertising (IBA) company.
  • Figure 2: Topics API multiplicative leakage given the size of the taxonomy for different sizes of the k-top topics list.

Theorems & Definitions (3)

  • Remark 1: Initial vulnerability
  • Lemma 2: Final vulnerability for third-party cookies
  • Lemma 3: Final vulnerability for the Topics API
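The quantities named in the remark and lemmas above (initial vulnerability, final vulnerability, and the multiplicative leakage plotted in Figure 2) can be illustrated with the standard QIF definitions of Bayes vulnerability. The following is a minimal sketch on a toy example, not the paper's model: the four users, the three browsing histories, and all function names are hypothetical, and the closed-form expressions of Lemmas 2 and 3 are not reproduced here.

```python
from fractions import Fraction


def prior_bayes_vulnerability(prior):
    """Adversary's chance of guessing the secret in one try, before observing anything."""
    return max(prior)


def posterior_bayes_vulnerability(prior, channel):
    """Expected chance of a correct guess after observing the channel output.

    channel[x][y] is the probability of output y given secret x.
    """
    n_secrets = len(prior)
    n_outputs = len(channel[0])
    return sum(
        max(prior[x] * channel[x][y] for x in range(n_secrets))
        for y in range(n_outputs)
    )


def multiplicative_leakage(prior, channel):
    """Ratio of final (posterior) to initial (prior) Bayes vulnerability."""
    return posterior_bayes_vulnerability(prior, channel) / prior_bayes_vulnerability(prior)


# Hypothetical toy setting: 4 users, uniform prior (assumed, as in the lemma's premise).
n_users = 4
uniform_prior = [Fraction(1, n_users)] * n_users

# Hypothetical deterministic channel: each user maps to exactly one browsing history.
# Users 0 and 1 share a history, so the adversary cannot tell them apart.
deterministic_channel = [
    [1, 0, 0],  # user 0 -> history A
    [1, 0, 0],  # user 1 -> history A
    [0, 1, 0],  # user 2 -> history B
    [0, 0, 1],  # user 3 -> history C
]

initial = prior_bayes_vulnerability(uniform_prior)
final = posterior_bayes_vulnerability(uniform_prior, deterministic_channel)
leakage = multiplicative_leakage(uniform_prior, deterministic_channel)

print(f"initial vulnerability: {initial}")
print(f"final vulnerability:   {final}")
print(f"multiplicative leakage: {leakage}")
```

In this toy case the final vulnerability equals the number of distinct histories divided by the number of users, which is the general behaviour of a deterministic channel under a uniform prior: the more users share an observable output, the lower the re-identification risk.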