Like Oil and Water: Group Robustness Methods and Poisoning Defenses May Be at Odds

Michael-Andrei Panaitescu-Liess, Yigitcan Kaya, Sicheng Zhu, Furong Huang, Tudor Dumitras

TL;DR

The paper uncovers a fundamental tension between group robustness methods and poisoning defenses in ML. It provides empirical evidence and a formal impossibility theorem showing that heuristics used to identify minority samples can misclassify poisons, leading to amplification and higher attacker success, while defenses that remove outliers can harm legitimate minority groups. The contributions include demonstrating that poisons are amplified by group-robustness techniques, showing disparate impacts of poisoning defenses on minority groups, and showing that naive combinations of defenses and robustness methods fail to reconcile the goals. The work underscores the risk of benchmark-driven scholarship obscuring essential trade-offs and calls for new methods that balance robustness to poisoning with equitable performance across groups.

Abstract

Group robustness has become a major concern in machine learning (ML) as conventional training paradigms were found to produce high error on minority groups. Without explicit group annotations, proposed solutions rely on heuristics that aim to identify and then amplify the minority samples during training. In our work, we first uncover a critical shortcoming of these methods: an inability to distinguish legitimate minority samples from poison samples in the training set. By amplifying poison samples as well, group robustness methods inadvertently boost the success rate of an adversary -- e.g., from $0\%$ without amplification to over $97\%$ with it. Notably, we supplement our empirical evidence with an impossibility result proving this inability of a standard heuristic under some assumptions. Moreover, scrutinizing recent poisoning defenses both in centralized and federated learning, we observe that they rely on similar heuristics to identify which samples should be eliminated as poisons. In consequence, minority samples are eliminated along with poisons, which damages group robustness -- e.g., from $55\%$ without the removal of the minority samples to $41\%$ with it. Finally, as they pursue opposing goals using similar heuristics, our attempt to alleviate the trade-off by combining group robustness methods and poisoning defenses falls short. By exposing this tension, we also hope to highlight how benchmark-driven ML scholarship can obscure the trade-offs among different metrics with potentially detrimental consequences.
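The shared-heuristic problem described in the abstract can be illustrated with a toy sketch. It is not the paper's method or any specific published algorithm. It assumes a hypothetical identification heuristic that flags every sample whose label receives low probability from an initial model, and it uses made-up probabilities and group sizes. The names `Sample`, `flag_hard_samples`, `amplify`, `eliminate`, and `weight_share` are illustrative only. The point is that one flagged set serves both purposes: a group robustness method upweights it, while a poisoning defense removes it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    kind: str         # "majority", "minority", or "poison"; hidden from the heuristic
    prob_true: float  # hypothetical model probability assigned to the sample's label


def flag_hard_samples(samples, threshold=0.5):
    """Flag samples whose label gets low probability (assumed heuristic)."""
    return [i for i, s in enumerate(samples) if s.prob_true < threshold]


def amplify(samples, flagged, factor=5):
    """Group-robustness use of the flags: upweight flagged samples."""
    flagged = set(flagged)
    return [factor if i in flagged else 1 for i in range(len(samples))]


def eliminate(samples, flagged):
    """Poisoning-defense use of the flags: drop flagged samples."""
    flagged = set(flagged)
    return [s for i, s in enumerate(samples) if i not in flagged]


def weight_share(samples, weights, kind):
    """Fraction of the total training weight carried by one kind of sample."""
    total = sum(weights)
    return sum(w for s, w in zip(samples, weights) if s.kind == kind) / total


# Toy data with invented numbers: poisons look "hard", just like the minority.
toy = (
    [Sample("majority", 0.9)] * 90
    + [Sample("minority", 0.3)] * 8
    + [Sample("poison", 0.2)] * 2
)

flagged = flag_hard_samples(toy)
weights = amplify(toy, flagged, factor=5)
kept = eliminate(toy, flagged)

uniform = [1] * len(toy)
print("poison weight share, uniform:  ", weight_share(toy, uniform, "poison"))
print("poison weight share, amplified:", weight_share(toy, weights, "poison"))
print("kinds surviving elimination:   ", sorted({s.kind for s in kept}))
```

In this toy setting the heuristic cannot separate the two low-probability populations, so amplification raises the poisons' share of the training weight and elimination removes the minority group along with the poisons. Real methods use richer signals than a single threshold, so the sketch only conveys the direction of the effect, not its magnitude.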

Paper Structure

This paper contains 22 sections, 2 theorems, 3 equations, 3 figures, 19 tables.

Key Result

Lemma 4.1

For the setting described above, if we assume that there are no ties in maximum expected class probability among groups, then the identification model has less expected class probability on the poisons $(y_m, a_m)^{*}$ in comparison to any legitimate group.

Figures (3)

  • Figure 1: Illustration of group robustness methods without (left) and with poisons (right) in the training set. For the former, the methods operate regularly as they identify and amplify the minority groups, but for the latter, they also amplify poisons and, therefore, the attacker's influence over the decision boundary. The left and the right panels of each figure correspond respectively to the identification phase of the group robustness methods and the resulting model trained after this phase. Note that the lighter-colored circles and triangles represent amplified points.
  • Figure 2: The elimination disparity between the under-represented (LRG) and over-represented (HRG) groups against EPIc. The x-axes show the iterations of EPIc, and y-axes show the Elimination-Factor (ELMF) for each group. From left to right, the first three plots are for DLBD, SA, GM attacks on Waterbirds and the last one is for DLBD on CelebA.
  • Figure 3: The percentage of samples from each group detected by STRIP when varying the threshold.

Theorems & Definitions (4)

  • Lemma 4.1
  • Theorem 4.2
  • proof
  • proof