Computationally Efficient Replicable Learning of Parities

Moshe Noivirt; Jessica Sorrell; Eliad Tsfadia

TL;DR

The paper investigates the computational relationship between replicability and privacy in learning, and gives the first polynomial-time replicable algorithm for realizable learning of parity functions over arbitrary distributions. The main technical contribution is RepLinearSpan, a replicable subspace-identification subroutine that, given $m$ vectors in ${\mathbb F}_2^d$, outputs a subspace of their span that contains at least a $1-\varepsilon$ fraction of the vectors, is $\rho$-replicable, and runs in time $O(m^2 d^3)$. A replicable parity learner then solves a linear system within the learned subspace, yielding a realizable $(\varepsilon,\delta)$-PAC learner for parity functions with sample complexity $\mathrm{poly}(d,1/\rho,1/\varepsilon,\log(1/\delta))$. Together, the results show that efficient replicable learning over general distributions extends beyond SQ-learning and approaches the power of efficient differentially private learning, indicating a closer computational alignment between replicability and privacy than previously known.

Abstract

We study the computational relationship between replicability (Impagliazzo et al. [STOC '22], Ghazi et al. [NeurIPS '21]) and other stability notions. Specifically, we focus on replicable PAC learning and its connections to differential privacy (Dwork et al. [TCC 2006]) and to the statistical query (SQ) model (Kearns [JACM '98]). Statistically, it was known that differentially private learning and replicable learning are equivalent and strictly more powerful than SQ-learning. Yet, computationally, all previously known efficient (i.e., polynomial-time) replicable learning algorithms were confined to SQ-learnable tasks or restricted distributions, in contrast to differentially private learning. Our main contribution is the first computationally efficient replicable algorithm for realizable learning of parities over arbitrary distributions, a task that is known to be hard in the SQ-model, but possible under differential privacy. This result provides the first evidence that efficient replicable learning over general distributions strictly extends efficient SQ-learning, and is closer in power to efficient differentially private learning, despite computational separations between replicability and privacy. Our main building block is a new, efficient, and replicable algorithm that, given a set of vectors, outputs a subspace of their linear span that covers most of them.
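To make the coverage guarantee of the building block concrete, the following minimal sketch measures what fraction of a set of vectors in ${\mathbb F}_2^d$ lies in the span of a given set of generators. It is not the paper's replicable algorithm: it only checks the "covers most of them" condition for a candidate subspace. Vectors are encoded as Python integers used as bitmasks, and the names `reduce_vector`, `build_basis`, and `coverage` are hypothetical helpers introduced for illustration.

```python
def reduce_vector(v, basis):
    """Reduce v against an XOR basis keyed by leading-bit position.

    Returns 0 exactly when v lies in the span of the basis over F_2.
    """
    while v:
        top = v.bit_length() - 1
        if top not in basis:
            return v
        v ^= basis[top]  # clears the leading bit, so v strictly decreases
    return 0


def build_basis(vectors):
    """Build an XOR basis (leading bit -> basis vector) for span(vectors)."""
    basis = {}
    for v in vectors:
        r = reduce_vector(v, basis)
        if r:
            basis[r.bit_length() - 1] = r
    return basis


def coverage(vectors, generators):
    """Fraction of `vectors` contained in the F_2-span of `generators`."""
    basis = build_basis(generators)
    inside = sum(1 for v in vectors if reduce_vector(v, basis) == 0)
    return inside / len(vectors)


# Illustrative data (an assumption, not from the paper): three of the four
# vectors lie in span{001, 010}.
vectors = [0b001, 0b010, 0b011, 0b100]
print(coverage(vectors, [0b001, 0b010]))
```

In this toy instance the candidate subspace covers a $1-\varepsilon$ fraction of the vectors for any $\varepsilon \ge 1/4$.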

Paper Structure (22 sections, 11 theorems, 18 equations, 4 algorithms)

Introduction
Our Results
Other Applications
Structure of Paper
Preliminaries
Notations
PAC Learning
Replicability
Replicable PAC Learning
Main Tools
Stable Partition Algorithm
PAC Learning of Parities
Replicable Linear Span
Replicability.
Coverage.
...and 7 more sections

Key Result

Theorem 1

There exists a polynomial-time $\rho$-replicable learning algorithm that (realizably) $(\varepsilon,\delta)$-PAC learns the class of parity functions over $\{0,1\}^d$ with sample complexity $\mathrm{poly}(d,1/\rho,1/\varepsilon,\log(1/\delta))$.
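For orientation, the classical non-replicable way to learn a parity in the realizable setting is Gaussian elimination over ${\mathbb F}_2$: find any $w$ with $\langle w, x\rangle = y \pmod 2$ on every sample. The sketch below shows only that baseline, not the algorithm of Theorem 1, whose replicable version solves the system within a replicably chosen subspace. Inputs are encoded as integer bitmasks, free variables are set to 0, and the function names are hypothetical.

```python
def parity(v):
    """Parity (mod-2 sum) of the bits of the integer v."""
    return bin(v).count("1") & 1


def learn_parity(samples):
    """Return a bitmask w with parity(w & x) == y for every (x, y) sample.

    Plain Gaussian elimination over F_2; raises ValueError if the samples
    are not realizable by any parity function.
    """
    pivots = {}  # leading bit -> (row, label)
    for x, y in samples:
        while x:
            top = x.bit_length() - 1
            if top not in pivots:
                pivots[top] = (x, y)
                break
            px, py = pivots[top]
            x ^= px
            y ^= py
        else:
            # Row reduced to zero: its label must also reduce to zero.
            if y:
                raise ValueError("samples are not consistent with any parity")

    # Back-substitution from the lowest pivot upward; free variables stay 0.
    w = 0
    for top in sorted(pivots):
        x, y = pivots[top]
        lower = x ^ (1 << top)  # remaining bits, all below the pivot
        if y ^ parity(lower & w):
            w |= 1 << top
    return w


# Illustrative data (an assumption): labels generated by the secret 0b101.
secret = 0b101
samples = [(x, parity(secret & x)) for x in (0b001, 0b010, 0b100, 0b111)]
w = learn_parity(samples)
print(bin(w))
```

When the samples do not span the whole space, the returned hypothesis depends on which samples were drawn, which is the kind of instability a replicable learner has to avoid.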

Theorems & Definitions (31)

Theorem 1: Replicable Learning of Parities
Theorem 2: Replicable Linear Span
Corollary 1
Definition 2.1: (Realizable) PAC Learnability, see e.g., shalev2014understanding
Definition 2.2
Definition 2.3: Replicability impagliazzo2022reproducibility
Definition 2.4
Proposition 1
Theorem 3: McDiarmid's Inequality mcdiarmid1989method
Theorem 4: Stable Partition, Algorithm 1 in kaplan2025differentially
...and 21 more
