Private Online Learning against an Adaptive Adversary: Realizable and Agnostic Settings
Bo Li, Wei Wang, Peng Ye
TL;DR
This work studies private online learning for concept classes of finite Littlestone dimension under adaptive adversaries, in both the realizable and agnostic settings. In the realizable setting, it gives an $(\varepsilon,\delta)$-DP online learner whose mistake bound depends logarithmically on the time horizon $T$, albeit doubly exponentially on the Littlestone dimension $d$, using a lazy-update mechanism and uniform convergence. In the agnostic setting, it gives a DP online learner achieving sublinear regret $\tilde{O}_d(\sqrt{T})$ against adaptive adversaries via two complementary strategies, batch sanitization and privately constructed experts, with accompanying improvements for oblivious adversaries. The results extend private online learning beyond the realizable case, showing that classes of finite Littlestone dimension are privately online learnable in the agnostic setting as well, and they connect private online learning to private online prediction from experts. Reducing the dependence on $d$ and obtaining proper private learners in the realizable setting remain open. Overall, the paper clarifies the trade-offs between privacy, the adaptivity of the adversary, and learning performance in online settings.
Abstract
We revisit the problem of private online learning, in which a learner receives a sequence of $T$ data points and must respond with a hypothesis at each time step. The entire stream of output hypotheses is required to satisfy differential privacy. Prior work of Golowich and Livni [2021] established that every concept class $\mathcal{H}$ with finite Littlestone dimension $d$ is privately online learnable in the realizable setting. In particular, they proposed an algorithm that achieves an $O_{d}(\log T)$ mistake bound against an oblivious adversary. However, their approach yields a suboptimal $\tilde{O}_{d}(\sqrt{T})$ bound against an adaptive adversary. In this work, we present a new algorithm with a mistake bound of $O_{d}(\log T)$ against an adaptive adversary, closing this gap. We further investigate the problem in the agnostic setting, which is more general than the realizable setting because it imposes no assumptions on the data. We give an algorithm that obtains a sublinear regret of $\tilde{O}_d(\sqrt{T})$ for generic Littlestone classes, demonstrating that they are also privately online learnable in the agnostic setting.
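For orientation, the two performance measures named above can be written out using the standard definitions from online learning. This is a sketch under assumed notation, not the paper's own formalization: $h_t$ denotes the hypothesis output at step $t$ and $(x_t, y_t)$ the labeled point revealed at that step, and for a randomized learner both quantities would typically be taken in expectation.

$$
\mathrm{Mistakes}_T = \sum_{t=1}^{T} \mathbb{1}\left[h_t(x_t) \neq y_t\right],
\qquad
\mathrm{Regret}_T = \sum_{t=1}^{T} \mathbb{1}\left[h_t(x_t) \neq y_t\right] - \min_{h \in \mathcal{H}} \sum_{t=1}^{T} \mathbb{1}\left[h(x_t) \neq y_t\right].
$$

In the realizable setting some $h \in \mathcal{H}$ labels every point correctly, so the minimum is zero and the regret coincides with the number of mistakes. In the agnostic setting no such $h$ is assumed, which is why the guarantee is stated as regret relative to the best hypothesis in hindsight.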
