Watermarking Decision Tree Ensembles

Stefano Calzavara, Lorenzo Cazzaro, Donald Gera, Salvatore Orlando

TL;DR

The paper addresses intellectual-property protection for decision tree ensembles, notably random forests, by embedding a watermark through a trigger set. The method trains two sub-ensembles and composes the final watermarked model by selecting trees from the two sub-ensembles according to a binary signature $\sigma \in \{0,1\}^m$; a trigger set $\mathcal{D}_{trigger}$ of size $k$ is used during training to enforce the targeted behavior. The security analysis argues robustness to watermark detection and suppression, and proves that watermark forgery is NP-hard via a reduction from 3SAT. Experiments on public datasets show negligible accuracy loss and strong resistance to detection, suppression, and forgery, indicating practical viability and suggesting an extension to gradient-boosted ensembles.
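
The composition step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions that the summary does not state: trees are modelled as plain Python callables, position $i$ of the ensemble is taken from the second sub-ensemble when $\sigma_i = 1$ and from the first otherwise, and the "targeted behavior" is assumed to mean that trees of the first sub-ensemble classify every trigger instance correctly while trees of the second misclassify every one. All names (`compose_watermarked`, `extract_signature`, `verify`) are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Instance = Tuple[float, ...]
Tree = Callable[[Instance], int]  # a tree is modelled as a function: instance -> label


def compose_watermarked(t0: Sequence[Tree], t1: Sequence[Tree],
                        sigma: Sequence[int]) -> List[Tree]:
    """Pick tree i from t1 if sigma[i] == 1, else from t0 (assumed convention)."""
    if not (len(t0) == len(t1) == len(sigma)):
        raise ValueError("both sub-ensembles and the signature must have length m")
    return [t1[i] if bit == 1 else t0[i] for i, bit in enumerate(sigma)]


def extract_signature(ensemble: Sequence[Tree],
                      trigger: Sequence[Tuple[Instance, int]]) -> Optional[List[int]]:
    """Read one bit per tree from its behavior on the trigger set.

    Assumption: bit 1 means the tree misclassifies every trigger instance,
    bit 0 means it classifies every trigger instance correctly.
    Returns None if some tree behaves inconsistently on the trigger set.
    """
    bits: List[int] = []
    for tree in ensemble:
        wrong = [tree(x) != y for x, y in trigger]
        if all(wrong):
            bits.append(1)
        elif not any(wrong):
            bits.append(0)
        else:
            return None
    return bits


def verify(ensemble: Sequence[Tree],
           trigger: Sequence[Tuple[Instance, int]],
           sigma: Sequence[int]) -> bool:
    """Accept the ownership claim only if the recovered bits equal sigma."""
    return extract_signature(ensemble, trigger) == list(sigma)


# Toy stand-ins for trained trees (hypothetical, one feature, binary labels).
def correct_tree(x: Instance) -> int:
    return 1 if x[0] < 0.5 else 0


def flipped_tree(x: Instance) -> int:
    return 1 - correct_tree(x)


trigger = [((0.2,), 1), ((0.8,), 0)]   # k = 2 trigger instances with true labels
sigma = [1, 0, 0, 1]                   # m = 4 signature bits
watermarked = compose_watermarked([correct_tree] * 4, [flipped_tree] * 4, sigma)
print(verify(watermarked, trigger, sigma))
```

In this toy setting the recovered bits match `sigma`, while a claim made with a different signature such as `[0, 0, 0, 1]` is rejected. A real instantiation would replace the toy callables with trees trained on data that includes or relabels $\mathcal{D}_{trigger}$.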

Abstract

Protecting the intellectual property of machine learning models is a hot topic and many watermarking schemes for deep neural networks have been proposed in the literature. Unfortunately, prior work largely neglected the investigation of watermarking techniques for other types of models, including decision tree ensembles, which are a state-of-the-art model for classification tasks on non-perceptual data. In this paper, we present the first watermarking scheme designed for decision tree ensembles, focusing in particular on random forest models. We discuss watermark creation and verification, presenting a thorough security analysis with respect to possible attacks. We finally perform an experimental evaluation of the proposed scheme, showing excellent results in terms of accuracy and security against the most relevant threats.

Paper Structure

This paper contains 13 sections, 1 theorem, 1 equation, 5 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

The watermark forgery problem is NP-hard.

Figures (5)

  • Figure 1: Example of a decision tree ensemble with two trees.
  • Figure 2: Conversion of the example formula $(x_1 \vee x_2) \wedge (x_2 \vee x_3 \vee \neg x_4)$ into a tree ensemble.
  • Figure 3: Accuracy of watermarked models on the test set when varying the percentage of training instances included in $\mathcal{D}_{trigger}$ (top figure) and the percentage of bits set to 1 in the signature $\sigma$ (bottom figure).
  • Figure 4: Size of the forged trigger set $\mathcal{D}_{trigger}'$ when varying the amount of distortion $\varepsilon$ on the MNIST2-6 dataset.
  • Figure 5: Instances generated by Z3 for $\varepsilon \in \{0.3, 0.5, 0.7\}$.

Theorems & Definitions (2)

  • Definition 1
  • Theorem 1