Streaming Private Continual Counting via Binning

Joel Daniel Andersson, Rasmus Pagh

TL;DR

This paper presents a simple approach to approximating factorization mechanisms in low space via binning, where adjacent matrix entries with similar values are changed to be identical in such a way that a matrix-vector product can be maintained in sublinear space.

Abstract

In differential privacy, $\textit{continual observation}$ refers to problems in which we wish to continuously release a function of a dataset that is revealed one element at a time. The challenge is to maintain a good approximation while keeping the combined output over all time steps differentially private. In the special case of $\textit{continual counting}$ we seek to approximate a sum of binary input elements. This problem has received considerable attention lately, in part due to its relevance in implementations of differentially private stochastic gradient descent. $\textit{Factorization mechanisms}$ are the leading approach to continual counting, but the best such mechanisms do not work well in $\textit{streaming}$ settings since they require space proportional to the size of the input. In this paper, we present a simple approach to approximating factorization mechanisms in low space via $\textit{binning}$, where adjacent matrix entries with similar values are changed to be identical in such a way that a matrix-vector product can be maintained in sublinear space. Our approach has provable sublinear space guarantees for a class of lower triangular matrices whose entries are monotonically decreasing away from the diagonal. We show empirically that even with very low space usage we are able to closely match, and sometimes surpass, the performance of asymptotically optimal factorization mechanisms. Recently, and independently of our work, Dvijotham et al. have also suggested an approach to implementing factorization mechanisms in a streaming setting. Their work differs from ours in several respects: It only addresses factorization into $\textit{Toeplitz}$ matrices, only considers $\textit{maximum}$ error, and uses a different technique based on rational function approximation that seems less versatile than our binning approach.
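
As a rough illustration of the binning idea described above, the following Python sketch maintains the product of a binned lower-triangular Toeplitz matrix with a stream while storing only one running sum per bin. It is not the paper's algorithm: the merge rule, the function names `bennett_coeff` and `binned_stream_product`, and the way the parameter `c` is used are assumptions made for illustration. The entries are the coefficients of $(1-z)^{-1/2}$, i.e. the square-root factor of the counting matrix; noise addition and the right factor are left out.

```python
import math


def bennett_coeff(k):
    """k-th coefficient of (1 - z)^(-1/2), i.e. Gamma(k + 1/2) / (Gamma(1/2) * k!)."""
    return math.exp(math.lgamma(k + 0.5) - math.lgamma(k + 1) - math.lgamma(0.5))


def binned_stream_product(xs, c=0.75):
    """Stream over xs and output, at each step t, row t of a binned matrix times xs[:t+1].

    Hypothetical merge rule (an assumption, not the paper's): two adjacent bins are
    merged once the smallest true entry they cover is at least c times the largest.
    Every column in a bin then shares the value of the bin's newest column.
    Returns (outputs, max number of bins held at any time).
    """
    bins = []  # each bin is [start_col, end_col, sum_of_x], ordered oldest to newest
    outputs, max_bins = [], 0
    for t, x in enumerate(xs):
        bins.append([t, t, x])  # the new column starts as its own bin
        merged = [bins[0]]
        for b in bins[1:]:
            prev = merged[-1]
            # Entries decrease away from the diagonal, so the oldest column of
            # prev holds the smallest value and the newest column of b the largest.
            if bennett_coeff(t - prev[0]) >= c * bennett_coeff(t - b[1]):
                prev[1] = b[1]
                prev[2] += b[2]
            else:
                merged.append(b)
        bins = merged
        max_bins = max(max_bins, len(bins))
        # One multiplication per bin instead of one per column.
        outputs.append(sum(bennett_coeff(t - end) * s for _, end, s in bins))
    return outputs, max_bins
```

Under this rule and for nonnegative inputs, each output lies between the exact product and the exact product divided by `c`, since every binned entry overestimates the true entry by a factor of at most `1/c`. The number of bins grows far more slowly than the stream length, which is the property the abstract refers to as sublinear space.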

Paper Structure

This paper contains 23 sections, 19 theorems, 26 equations, 5 figures, 1 algorithm.

Key Result

Theorem 1

Let $B\in\mathbb{R}^{n\times n}$ be the Bennett matrix. Then for constant $\xi \in (0,1)$ there is a factorization $\hat{L}\hat{R}=B^2$ where:

Figures (5)

  • Figure 1: Illustration of the concept of binning on a 20-by-20 matrix. Each square corresponds to an entry in the matrix, with a darker shade illustrating a larger value. The binned matrix in \ref{fig:binned_example} uses at most 4 intervals per row, as illustrated by each row of the matrix having at most 4 shades.
  • Figure 2: Plot of entries of different factorizations of the counting matrix $A = A_{\alpha=1, \beta=0}$ for $n=50$. Darker shades imply matrix entries closer to 1; white implies 0. \ref{fig:bennett_matrix} corresponds to the factorization $A=B^2$. $\hat{L}$ is the $\mathcal{B}$-approximation of $B$ generated from \ref{alg:matrix_to_binning} when run with $c=0.75, \tau=0.02$ on $B$, and $\hat{R} = \hat{L}^{-1} A$. For the matrices shown we have that $\mathrm{MeanSE}(\hat{L}, \hat{R}) / \mathrm{MeanSE}(B,B) = 0.9965$, $\mathrm{MaxSE}(\hat{L}, \hat{R}) / \mathrm{MaxSE}(B, B) = 0.9951$ and $\lvert\mathcal{B}\rvert = 8$.
  • Figure 3: Plots showing the trade-off between space complexity and multiplicative blow-up in the mean and maximum squared error for our factorizations relative to henzinger_almost_2023. All factorizations shown here were produced by running \ref{alg:matrix_to_binning} with $c = 1 - 1/d$ for integers $d\geq 2$ and $\tau = 1/n$. In \ref{fig:mse_space_vs_approx} and \ref{fig:maxse_space_vs_approx}, $d$ was initialized to $2$ for each $n$, and then incremented until the error was sufficiently small for each curve. In \ref{fig:mse_mult_vs_space} and \ref{fig:maxse_mult_vs_space} the points are generated using $1.1 \leq d \leq 100$. \ref{fig:mse_space_vs_approx} and \ref{fig:maxse_space_vs_approx} show the space complexity (binning size) needed by the approximation $\hat{L}$, as a function of $n$, to be within a given multiple of this error. The dashed lines show the space complexity needed to be no more than some small fraction above the error, whereas the black line shows the space needed for our method to achieve a smaller error. The line with dots shows the space complexity of the binary mechanism in chan_private_2011, whose error is asymptotically worse by a factor of $10$ or $21$, for mean and maximum squared error respectively. \ref{fig:mse_mult_vs_space} and \ref{fig:maxse_mult_vs_space} show the trade-off between the blow-up in error and space for different values of $n$.
  • Figure 4: Plot of entries of different factorizations of $A_{\alpha, \beta}$ for $n=50$. Darker shades imply matrix entries closer to 1; white implies 0. \ref{fig:sqrt_alphabeta1} and \ref{fig:sqrt_alphabeta2} correspond to the square-root factorization $A_{\alpha, \beta}=B_{\alpha, \beta}^2$. $\hat{L}$ is the $\mathcal{B}$-approximation of $B_{\alpha, \beta}$ generated from \ref{alg:matrix_to_binning} when run with $c$ and $\tau$ as specified in the figures, on $B_{\alpha, \beta}$, and where $\hat{R} = \hat{L}^{-1} A_{\alpha, \beta}$. The factorization shown in Fig. \ref{fig:L_alphabeta1} and \ref{fig:R_alphabeta1} achieves $\mathrm{MeanSE}(\hat{L}, \hat{R}) / \mathrm{MeanSE}(B_{\alpha, \beta},B_{\alpha, \beta}) = 0.9945$ and $\mathrm{MaxSE}(\hat{L}, \hat{R}) / \mathrm{MaxSE}(B_{\alpha, \beta},B_{\alpha, \beta}) = 0.9947$, compared to a relative error of $1.015$ and $1.026$ respectively for the factorization in Fig. \ref{fig:L_alphabeta2} and \ref{fig:R_alphabeta2}. Both our factorizations have a binning size $\lvert\mathcal{B}\rvert$ of $8$.
  • Figure 5: Plots showing the trade-off between space complexity and multiplicative blow-up in the mean and maximum squared error for our factorizations relative to the square-root factorization of $A_{\alpha, \beta}$, for different $\alpha, \beta$. All factorizations shown here were produced by running \ref{alg:matrix_to_binning} with $c = 1 - 1/d$ for $1.1 \leq d \leq 100$, and $\tau = 1/n$.
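
The captions above report ratios of $\mathrm{MeanSE}$ and $\mathrm{MaxSE}$ without defining them here. The sketch below assumes the definitions commonly used for factorization mechanisms, which may differ from the paper's by a constant or a square root: $\mathrm{MeanSE}(L,R) = \frac{1}{n}\lVert L\rVert_F^2 \cdot \max_j \lVert R_{:,j}\rVert_2^2$ and $\mathrm{MaxSE}(L,R) = \max_i \lVert L_{i,:}\rVert_2^2 \cdot \max_j \lVert R_{:,j}\rVert_2^2$. All function names are hypothetical. It builds the square-root factor $B$ of the counting matrix and evaluates both quantities for $(B, B)$.

```python
def bennett_matrix(n):
    """Lower-triangular Toeplitz matrix whose entries are the coefficients of (1 - z)^(-1/2)."""
    f = [1.0]
    for k in range(1, n):
        f.append(f[-1] * (2 * k - 1) / (2 * k))
    return [[f[i - j] if j <= i else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain O(n^3) product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def max_col_sq_norm(m):
    """Largest squared Euclidean norm over the columns of m (assumed sensitivity term)."""
    n = len(m)
    return max(sum(m[i][j] ** 2 for i in range(n)) for j in range(n))


def mean_se(left, right):
    """Assumed definition: squared Frobenius norm of left, divided by n, times sensitivity."""
    n = len(left)
    frob_sq = sum(v * v for row in left for v in row)
    return frob_sq / n * max_col_sq_norm(right)


def max_se(left, right):
    """Assumed definition: largest squared row norm of left, times sensitivity."""
    max_row_sq = max(sum(v * v for v in row) for row in left)
    return max_row_sq * max_col_sq_norm(right)


n = 50
B = bennett_matrix(n)
A = [[1.0 if j <= i else 0.0 for j in range(n)] for i in range(n)]   # counting matrix
I = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]   # identity

print("MeanSE(B, B) =", mean_se(B, B))
print("MaxSE(B, B)  =", max_se(B, B))
print("MeanSE(A, I) =", mean_se(A, I))  # trivial factorization, for comparison
```

A ratio such as $\mathrm{MeanSE}(\hat{L}, \hat{R}) / \mathrm{MeanSE}(B,B)$ could then be computed by passing a binned $\hat{L}$ and $\hat{R} = \hat{L}^{-1} A$ to the same functions. Because both quantities scale identically under the assumed definitions, such ratios do not depend on the noise scale.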

Theorems & Definitions (49)

  • Definition 1: Space complexity of $L$ for streaming
  • Theorem 1
  • Definition 2: $(\varepsilon, \delta)$-Differential Privacy dwork_algorithmic_2013
  • Lemma 1: Gaussian Mechanism dwork_algorithmic_2013
  • Definition 3
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • ...and 39 more