Private Learning of Littlestone Classes, Revisited
Xin Lyu
TL;DR
This work studies private online and PAC learning of Littlestone classes under approximate differential privacy, and shows that both are achievable with guarantees polynomial in the Littlestone dimension $d$. The paper introduces a $(p,d)$-decomposition and irreducibility framework, coupled with a private sparse selection mechanism based on a sparse Exponential Mechanism, to obtain a private online learner with a realizable-case mistake bound of $\tilde{O}(d^{9.5}\log(T)\log(1/\delta)/\varepsilon)$ and a private PAC learner with sample complexity $\tilde{O}(\log(1/\delta)\,d^{5}/(\varepsilon \alpha))$. The approach uses interleaving hypothesis classes, the split-and-aggregate paradigm, and AboveThreshold primitives to implement a private halving-like process in the online setting, and a DP-ERM construction in the PAC setting. Overall, the online result is a doubly-exponential improvement over the previously best known private mistake bound for Littlestone classes, and the PAC result attains an optimal dependence on the accuracy parameter $\alpha$.
Abstract
We consider online and PAC learning of Littlestone classes subject to the constraint of approximate differential privacy. Our main result is a private learner that online-learns a Littlestone class with a mistake bound of $\tilde{O}(d^{9.5}\cdot \log(T))$ in the realizable case, where $d$ denotes the Littlestone dimension and $T$ the time horizon. This is a doubly-exponential improvement over the state-of-the-art [GL'21] and comes polynomially close to the lower bound for this task. The advancement is made possible by a couple of ingredients. The first is a clean and refined interpretation of the ``irreducibility'' technique from the state-of-the-art private PAC-learner for Littlestone classes [GGKM'21]. Our new perspective also allows us to improve the PAC-learner of [GGKM'21] and give a sample complexity upper bound of $\widetilde{O}(\frac{d^5 \log(1/\delta\beta)}{\varepsilon \alpha})$, where $\alpha$ and $\beta$ denote the accuracy and confidence of the PAC learner, respectively. This improves over [GGKM'21] by a factor of $\frac{d}{\alpha}$ and attains an optimal dependence on $\alpha$. The second is a private sparse selection algorithm, which our algorithm uses to \emph{sample} from a pool of strongly input-dependent candidates. However, unlike most previous uses of sparse selection algorithms, where one only cares about the utility of the output, our algorithm requires understanding and manipulating the actual distribution from which an output is drawn. In the proof, we use a sparse version of the Exponential Mechanism from [GKM'21], which behaves nicely under our framework and is amenable to a very easy utility proof.
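For readers unfamiliar with the selection primitive mentioned above, the following is a minimal sketch of the standard (dense) Exponential Mechanism, not the sparse variant of [GKM'21] used in the paper. The function names, the candidate labels, and the toy scores are hypothetical and chosen only for illustration. The sketch exposes the output distribution explicitly, since the abstract stresses that the analysis depends on the distribution being sampled from and not only on the utility of the sample.

```python
import math
import random


def selection_probabilities(candidates, score, epsilon, sensitivity=1.0):
    """Return the Exponential Mechanism's output distribution over candidates.

    Each candidate c receives probability proportional to
    exp(epsilon * score(c) / (2 * sensitivity)).
    """
    scores = [score(c) for c in candidates]
    top = max(scores)  # shift by the max score for numerical stability
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(candidates, score, epsilon, sensitivity=1.0, rng=None):
    """Draw one candidate from the distribution computed above."""
    rng = rng or random.Random()
    probs = selection_probabilities(candidates, score, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]


# Hypothetical toy instance: three hypotheses scored by negated training error.
toy_scores = {"h0": 0.0, "h1": -3.0, "h2": -5.0}
toy_candidates = list(toy_scores)
toy_probs = selection_probabilities(toy_candidates, toy_scores.get, epsilon=1.0)
toy_choice = exponential_mechanism(
    toy_candidates, toy_scores.get, epsilon=1.0, rng=random.Random(0)
)
```

In this toy instance the hypothesis with the fewest errors is the most likely output, and the ratio between any two probabilities depends only on the score gap scaled by $\varepsilon/2$. The dense version enumerates every candidate, which is the step a sparse selection algorithm is designed to avoid when the candidate pool is large and input-dependent.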
