Regularized Robustly Reliable Learners and Instance Targeted Attacks

Avrim Blum, Donya Saless

TL;DR

This work tackles instance-targeted data poisoning by extending the robustly-reliable learning framework with regularized guarantees that remain informative for flexible hypothesis classes. It defines Regularized Robustly Reliable Learners (RRRLs) that, for a test point $x$, output a prediction together with a pair of complexity bounds $(c_{low}, c_{high})$ such that the prediction equals $f^*(x)$ whenever the corruption budget is at most $b$ and the target function $f^*$ has complexity below $c_{high}$. The authors establish the optimal empirical reliability region ${\rm \widehat{OPTR^4}}$, show conditions under which their RRRL achieves this optimum, and provide sample-complexity bounds under i.i.d. data. They develop efficient algorithms for key complexity measures, including Number of Alterations via bidirectional dynamic programming, Local Margin via test-time radii, and Global Margin via dynamic maximum matching on classification graphs, plus NP-hardness results for multi-class Global Margin. While computationally intensive in the general case, these methods give principled, per-instance guarantees against data poisoning and demonstrate practical routes to sublinear-time test-time certification in several natural settings.

Abstract

Instance-targeted data poisoning attacks, where an adversary corrupts a training set to induce errors on specific test points, have raised significant concerns. Balcan et al. (2022) proposed an approach to addressing this challenge by defining a notion of robustly-reliable learners that provide per-instance guarantees of correctness under well-defined assumptions, even in the presence of data poisoning attacks. They then give a generic optimal (but computationally inefficient) robustly-reliable learner as well as a computationally efficient algorithm for the case of linear separators over log-concave distributions. In this work, we address two challenges left open by Balcan et al. (2022). The first is that the definition of robustly-reliable learners in Balcan et al. (2022) becomes vacuous for highly flexible hypothesis classes: if there are two classifiers $h_0, h_1 \in H$, both with zero error on the training set, such that $h_0(x) \neq h_1(x)$, then a robustly-reliable learner must abstain on $x$. We address this problem by defining a modified notion of regularized robustly-reliable learners that allows for nontrivial statements in this case. The second is that the generic algorithm of Balcan et al. (2022) requires re-running an ERM oracle (essentially, retraining the classifier) on each test point $x$, which is generally impractical even if ERM can be implemented efficiently. To tackle this problem, we show that, at least in certain interesting cases, we can design algorithms that produce their outputs in time sublinear in training time, by using techniques from dynamic algorithm design.

Paper Structure

This paper contains 25 sections, 11 theorems, 42 equations, 9 figures, 1 table, 1 algorithm.

Key Result

Theorem 3.1

For any RRR learner $\mathcal{L}'$ we have ${\rm \widehat{R^4}}_{\mathcal{L}'}(S',b,c) \subseteq {\rm \widehat{OPTR^4}}(S',b,c)$. Moreover, there exists an RRR learner $\mathcal{L}$ such that ${\rm \widehat{R^4}}_{\mathcal{L}}(S',b,c) = {\rm \widehat{OPTR^4}}(S',b,c)$.
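To make the existence claim more concrete, the following is a minimal brute-force sketch of what a learner of this kind could look like, under strong simplifying assumptions that are not taken from the paper: the hypothesis class is a small finite list, each hypothesis comes with an explicit complexity value, and labels are binary. The names `rrr_predict`, `hypotheses`, and `sample` are hypothetical. For each label, the sketch finds the least complex hypothesis that makes at most $b$ mistakes on the (possibly corrupted) sample $S'$ and assigns that label to the test point; it predicts the label with the smaller value and reports the two values as $(c_{low}, c_{high})$.

```python
from math import inf

def rrr_predict(hypotheses, sample, b, x, labels=(0, 1)):
    """Brute-force sketch (illustrative assumption, not the paper's algorithm).

    hypotheses: list of (h, complexity) pairs, where h maps a point to a label.
    sample:     list of (x_i, y_i) pairs, possibly corrupted in up to b places.
    Returns (prediction, c_low, c_high).
    """
    # best[y] = smallest complexity of a hypothesis with <= b mistakes
    # on the sample that labels the test point x as y.
    best = {y: inf for y in labels}
    for h, complexity in hypotheses:
        mistakes = sum(h(xi) != yi for xi, yi in sample)
        if mistakes <= b:
            y = h(x)
            best[y] = min(best[y], complexity)
    pred = min(labels, key=lambda y: best[y])
    c_low = best[pred]
    c_high = min(best[y] for y in labels if y != pred)
    return pred, c_low, c_high

# Toy usage: threshold classifiers h_t(x) = [x >= t] with a made-up
# complexity |t| (purely hypothetical, chosen only to have distinct values).
thresholds = [(lambda x, t=t: int(x >= t), abs(t)) for t in (0, 1, 2, 3)]
data = [(0.5, 0), (2.5, 1)]
print(rrr_predict(thresholds, data, b=0, x=1.5))  # (1, 1, 2)
```

In the toy run, every hypothesis consistent with the data that labels the test point 0 has complexity at least 2, so the prediction 1 is guaranteed to be correct whenever the target has complexity below 2. Each call scans the whole sample for every hypothesis, which illustrates why per-test-point retraining is costly and why the abstract emphasizes faster test-time procedures.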

Figures (9)

  • Figure 1: The blue regions depict ${\rm \widehat{OPTR^4}}(S',0,8)$ described in Definition 2.3 for the number-of-alternations complexity measure, mistake budget $b=0$, and complexity level $c=8$.
  • Figure 2: Number of Alterations with $\mathbb{R} \to \{+,-\}$ Data
  • Figure 3: Test point arrives
  • Figure 4: Illustration of a function's behavior on the left and right sides of a test point. Leftmost: The function labels both the left and right neighbors of the test point as positive. Labeling the test point as positive does not increase complexity, but labeling it as negative increases the complexity by two. Middle figures: The function labels the left neighbor as positive (or negative) and the right neighbor as negative (or positive). The complexity is the sum of the complexities on each side of the test point plus one, since the function must change label once to connect the left side to the right side, regardless of the test point's label. Rightmost: The function labels both neighbors as negative. Labeling the test point as negative does not increase complexity, but labeling it as positive increases the complexity by two.
  • Figure 5: Local Margin example ($x_{test}$ at center)
  • ...and 4 more figures
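
The case analysis in the Figure 4 caption can be turned into a small program. The sketch below is written in the spirit of the bidirectional dynamic programming mentioned in the TL;DR, but it is an illustration under stated assumptions rather than the authors' implementation: data lie on the real line with distinct coordinates, labels are 0/1, the complexity of a function is the number of label changes along the line, and the function may disagree with at most $b$ training labels. The names `_sweep` and `AlternationCertifier` are hypothetical. Prefix and suffix tables are built once; each test query then combines them in time proportional to $b$, plus a binary search.

```python
from bisect import bisect_left

INF = float("inf")

def _sweep(labels, b):
    """table[i][k][l]: fewest label changes over the first i points, using at
    most k disagreements with the data, with the function equal to l at point i.
    table[0] is all zeros (no points seen, nothing constrained)."""
    table = [[[0, 0] for _ in range(b + 1)]]
    for y in labels:
        prev = table[-1]
        cur = [[INF, INF] for _ in range(b + 1)]
        for k in range(b + 1):
            for l in (0, 1):
                cost = 0 if l == y else 1  # disagreeing with the data spends budget
                if cost <= k:
                    src = prev[k - cost]
                    cur[k][l] = min(src[l], src[1 - l] + 1)
        table.append(cur)
    return table

class AlternationCertifier:
    """Hypothetical helper: precomputes both sweeps, then answers test queries."""

    def __init__(self, xs, ys, b):
        order = sorted(range(len(xs)), key=lambda i: xs[i])
        self.xs = [xs[i] for i in order]
        labels = [ys[i] for i in order]
        self.b = b
        self.pre = _sweep(labels, b)        # left-to-right tables
        self.suf = _sweep(labels[::-1], b)  # right-to-left tables

    def _best(self, i, y):
        # Fewest label changes of a function that labels the test point y,
        # where i training points lie to the left of the test point.
        left, right = self.pre[i], self.suf[len(self.xs) - i]
        best = INF
        for k in range(self.b + 1):          # split the budget between sides
            for l in (0, 1):                 # value at the left neighbor
                for r in (0, 1):             # value at the right neighbor
                    total = left[k][l] + (l != y) + (y != r) + right[self.b - k][r]
                    best = min(best, total)
        return best

    def predict(self, x):
        i = bisect_left(self.xs, x)
        c0, c1 = self._best(i, 0), self._best(i, 1)
        y = 0 if c0 <= c1 else 1
        return y, min(c0, c1), max(c0, c1)  # (prediction, c_low, c_high)

cert = AlternationCertifier([1, 2, 3, 4, 5, 6], [1, 1, 1, 0, 0, 0], b=0)
print(cert.predict(2.5))  # (1, 1, 3): both neighbors agree, flipping costs two
print(cert.predict(3.5))  # (0, 1, 1): neighbors disagree, no separation
```

The two printed cases mirror the caption: when both neighbors share a label, the opposite prediction costs two extra changes, so $c_{high} = c_{low} + 2$; when the neighbors differ, either label gives the same complexity and the output carries no guarantee.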

Theorems & Definitions (59)

  • Definition 2.1: Regularized Robustly Reliable Learner
  • Remark 1
  • Remark 2
  • Remark 3
  • Definition 2.2: Empirical Regularized Robustly Reliable Region
  • Definition 2.3: Optimal Empirical Regularized Robustly Reliable Region
  • Theorem 3.1
  • proof
  • Definition 3.2: Regularized Robustly Reliable Region
  • Remark 4
  • ...and 49 more