Privately Evaluating Untrusted Black-Box Functions

Ephraim Linder, Sofya Raskhodnikova, Adam Smith, Thomas Steinke

TL;DR

Privately Evaluating Untrusted Black-Box Functions addresses how a data curator can share sensitive data safely without knowing analysts' questions in advance: the curator evaluates an analyst-supplied black-box program $f$ on a dataset $x$ through differentially private algorithms called privacy wrappers. The authors consider two settings, Automated Sensitivity Detection and Claimed Sensitivity Bound, and design novel wrappers, including Sens-o-Matic and Subset-Extension, whose accuracy and locality are near-optimal and are governed by the down sensitivity $DS^f_\lambda(x)$ and by the behavior of $f$ on the $\lambda$-down neighborhood $\mathcal{N}^\downarrow_\lambda(x)$. They provide matching lower bounds on locality and query complexity and show independence from the universe size in bounded-range scenarios, extending private evaluation from white-box analyses to general black-box functions. Together, these results support automated, privacy-preserving evaluation of complex analyses submitted by untrusted analysts.

Abstract

We provide tools for sharing sensitive data when the data curator does not know in advance what questions an (untrusted) analyst might ask about the data. The analyst can specify a program that they want the curator to run on the dataset. We model the program as a black-box function $f$. We study differentially private algorithms, called privacy wrappers, that, given black-box access to a real-valued function $f$ and a sensitive dataset $x$, output an accurate approximation to $f(x)$. The dataset $x$ is modeled as a finite subset of a possibly infinite set $U$, in which each entry represents data of one individual. A privacy wrapper calls $f$ on the dataset $x$ and on some subsets of $x$ and returns either an approximation to $f(x)$ or a nonresponse symbol $\perp$. The wrapper may also use additional information (that is, parameters) provided by the analyst, but differential privacy is required for all values of these parameters. Correct setting of these parameters will ensure better accuracy of the wrapper. The bottleneck in the running time of our wrappers is the number of calls to $f$, which we refer to as queries. Our goal is to design wrappers with high accuracy and low query complexity. We introduce a novel setting, the automated sensitivity detection setting, where the analyst supplies the black-box function $f$ and the intended (finite) range of $f$. In the previously considered setting, the claimed sensitivity bound setting, the analyst supplies additional parameters that describe the sensitivity of $f$. We design privacy wrappers for both settings and show that our wrappers are nearly optimal in terms of accuracy, locality (i.e., the depth of the local neighborhood of the dataset $x$ they explore), and query complexity. In the claimed sensitivity bound setting, we provide the first accuracy guarantees that have no dependence on the size of the universe $U$.

Paper Structure

This paper contains 5 sections, 2 equations, 3 tables.

Theorems & Definitions (1)

  • Definition 1.1: Down sensitivity at specified depth
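
The listing names Definition 1.1 but does not reproduce it. As a rough illustration only, the sketch below assumes one plausible reading: the $\lambda$-down neighborhood $\mathcal{N}^\downarrow_\lambda(x)$ is the set of subsets of $x$ obtained by removing at most $\lambda$ elements, and the down sensitivity at depth $\lambda$ is the largest change in $f$ between two such subsets that differ in a single element. The function names `down_neighborhood` and `down_sensitivity` are hypothetical, the enumeration is brute force (exponential in $\lambda$), and nothing here is differentially private; it is not one of the paper's wrappers.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Hashable, Iterator

Dataset = FrozenSet[Hashable]


def down_neighborhood(x: Dataset, depth: int) -> Iterator[Dataset]:
    """Yield every subset of x obtained by removing at most `depth` elements.

    Assumed reading of the lambda-down neighborhood; yields nothing if depth < 0.
    """
    items = list(x)
    for k in range(min(depth, len(items)) + 1):
        for removed in combinations(items, k):
            yield x - frozenset(removed)


def down_sensitivity(f: Callable[[Dataset], float], x: Dataset, depth: int) -> float:
    """Largest |f(y) - f(z)| over pairs y, z in the depth-down neighborhood of x
    where z is y with one element removed (assumed definition, brute force)."""
    worst = 0.0
    # y may have lost at most depth - 1 elements so that z = y minus one
    # element still lies inside the depth-down neighborhood.
    for y in down_neighborhood(x, depth - 1):
        for item in y:
            z = y - {item}
            worst = max(worst, abs(f(y) - f(z)))
    return worst


if __name__ == "__main__":
    x = frozenset({1, 2, 10})
    largest = lambda s: float(max(s, default=0))  # toy black-box function
    for lam in range(4):
        print(lam, down_sensitivity(largest, x, lam))
```

Even on this toy input the value grows with depth, because deeper subsets expose larger single-element changes. Each call to `f` in the loop corresponds to one query in the abstract's sense, which is why the number of subsets explored matters for running time.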