Timber! Poisoning Decision Trees

Stefano Calzavara, Lorenzo Cazzaro, Massimo Vettori

TL;DR

Timber is the first white-box poisoning attack on decision trees. It uses a tree-annotation mechanism and sub-tree retraining to efficiently estimate the impact of label flips and apply the most damaging ones. An optional early-stopping variant reduces runtime, making the attack feasible on larger datasets, and the approach extends to random forests. Experiments on four public datasets show that Timber and its variant outperform baselines in effectiveness, efficiency, or both, while the model-agnostic defenses evaluated provide only partial mitigation. The work exposes a vulnerability of tree-based methods and motivates defenses tailored to decision-tree ensembles and other non-differentiable learners.

Abstract

We present Timber, the first white-box poisoning attack targeting decision trees. Timber is based on a greedy attack strategy that leverages sub-tree retraining to efficiently estimate the damage caused by poisoning a given training instance. The attack relies on a tree annotation procedure, which enables the sorting of training instances so that they are processed in increasing order of the computational cost of sub-tree retraining. This sorting yields a variant of Timber that supports an early stopping criterion, designed to make poisoning attacks more efficient and feasible on larger datasets. We also discuss an extension of Timber to traditional random forest models, which is valuable since decision trees are typically combined into ensembles to improve their predictive power. Our experimental evaluation on public datasets demonstrates that our attacks outperform existing baselines in terms of effectiveness, efficiency, or both. Moreover, we show that two representative defenses can mitigate the effect of our attacks, but fail to effectively thwart them.
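
To make the greedy strategy concrete, here is a minimal sketch of a greedy label-flipping loop. It is not the authors' algorithm: it retrains the whole model for every candidate flip, which is exactly the cost that Timber's sub-tree retraining is meant to avoid. The one-split "stump" learner, the function names (`train_stump`, `accuracy`, `greedy_label_flip`), and the use of validation accuracy as the damage measure are all assumptions made for illustration.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, int]]  # (single feature value, binary label)
Stump = Tuple[float, int]          # (threshold, label predicted when x <= threshold)


def predict(model: Stump, x: float) -> int:
    threshold, left_label = model
    return left_label if x <= threshold else 1 - left_label


def train_stump(data: Dataset) -> Stump:
    """Fit a one-split tree by exhaustive search over thresholds (toy learner)."""
    best = None
    for threshold in sorted({x for x, _ in data}):
        for left_label in (0, 1):
            errors = sum(predict((threshold, left_label), x) != y for x, y in data)
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label)
    return best[1], best[2]


def accuracy(model: Stump, data: Dataset) -> float:
    return sum(predict(model, x) == y for x, y in data) / len(data)


def greedy_label_flip(train: Dataset, valid: Dataset, budget: int):
    """Flip up to `budget` training labels, each time choosing the single flip
    that lowers validation accuracy the most (naive full retraining)."""
    poisoned = list(train)
    flipped: List[int] = []
    for _ in range(budget):
        best_idx = None
        best_acc = accuracy(train_stump(poisoned), valid)
        for i, (x, y) in enumerate(poisoned):
            if i in flipped:
                continue  # do not flip the same instance twice
            candidate = poisoned[:i] + [(x, 1 - y)] + poisoned[i + 1:]
            acc = accuracy(train_stump(candidate), valid)
            if acc < best_acc:
                best_idx, best_acc = i, acc
        if best_idx is None:
            break  # no remaining flip causes further damage
        x, y = poisoned[best_idx]
        poisoned[best_idx] = (x, 1 - y)
        flipped.append(best_idx)
    return poisoned, flipped
```

Each round costs one full retraining per training instance, which is what makes the naive strategy impractical on large datasets and motivates restricting retraining to the affected sub-tree.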

Paper Structure

This paper contains 21 sections, 2 equations, 7 figures, 5 tables, 3 algorithms.

Figures (7)

  • Figure 1: Example of decision tree.
  • Figure 2: Intuition of the Timber attack. If flipping the label of the instance $(\vec{x},y)$ does not invalidate the best split of the root and $\vec{x}$ falls in its left child, only the sub-tree in red (including 400 instances) may need retraining.
  • Figure 3: Splitting the dataset $\mathcal{D}$ based on the split $(f,v)$. Poisoning attacks can target positive or negative instances on the left or on the right of the split, leading to four attack possibilities that we must account for.
  • Figure 4: Empirical cumulative distribution function of the mean scores of the training instances averaged over the rounds of TES on the considered datasets. The scores range from 0 to 1.
  • Figure 5: F1 score of the attacked model under different poisoning attacks for budget $k$ equal to different percentages of poisoned training data, from 1% to 10%. A red horizontal line represents the F1 score of the model trained on the clean training set. Note that Timber is guaranteed to produce the same F1 score loss as the Greedy attack strategy.
  • ...and 2 more figures
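
The four attack possibilities named in the Figure 3 caption can be illustrated with a small calculation. The sketch below assumes Gini impurity as the split criterion, which the text above does not specify, and uses hypothetical names (`SplitCounts`, `split_impurity`, `flip_cases`). Given the label counts on each side of a split $(f,v)$, it reports the weighted impurity of that split after each possible single-label flip.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class SplitCounts:
    """Label counts on each side of a split (f, v)."""
    left_pos: int
    left_neg: int
    right_pos: int
    right_neg: int


def gini(pos: int, neg: int) -> float:
    """Gini impurity of a binary-labelled node; 0.0 for an empty node."""
    total = pos + neg
    if total == 0:
        return 0.0
    p = pos / total
    return 2.0 * p * (1.0 - p)


def split_impurity(c: SplitCounts) -> float:
    """Size-weighted Gini impurity of the two children."""
    n_left = c.left_pos + c.left_neg
    n_right = c.right_pos + c.right_neg
    weighted = n_left * gini(c.left_pos, c.left_neg) + n_right * gini(c.right_pos, c.right_neg)
    return weighted / (n_left + n_right)


def flip_cases(c: SplitCounts) -> Dict[str, float]:
    """Impurity of the same split after each feasible single-label flip."""
    cases = {}
    if c.left_pos > 0:
        cases["left, positive -> negative"] = replace(c, left_pos=c.left_pos - 1, left_neg=c.left_neg + 1)
    if c.left_neg > 0:
        cases["left, negative -> positive"] = replace(c, left_neg=c.left_neg - 1, left_pos=c.left_pos + 1)
    if c.right_pos > 0:
        cases["right, positive -> negative"] = replace(c, right_pos=c.right_pos - 1, right_neg=c.right_neg + 1)
    if c.right_neg > 0:
        cases["right, negative -> positive"] = replace(c, right_neg=c.right_neg - 1, right_pos=c.right_pos + 1)
    return {name: split_impurity(counts) for name, counts in cases.items()}
```

Because a flip moves no instance across the split, only the label counts of one child change. Comparing the updated impurity of the current split against the updated impurity of competing splits is one way to check whether a flip would invalidate the best split at a node.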