Robust Watermarking on Gradient Boosting Decision Trees

Jun Woo Chung, Yingjie Lao, Weijie Zhao

TL;DR

This paper addresses robust ownership protection for gradient boosting decision trees (GBDT) by introducing an in-place watermarking framework that embeds watermarks through fine-tuning existing trees rather than adding new ones. It proposes four embedding strategies—Wrong Prediction Flip, Outlier Flip, Cluster Center Flip, and Confidence Flip—to minimize accuracy impact while ensuring watermark robustness. Across diverse datasets, the methods achieve high watermarking effectiveness with limited degradation to general performance and demonstrate resilience to further fine-tuning, enabling post-deployment ownership verification. The work advances practical IP protection for GBDT models in industry and academia, providing guidance on strategy selection depending on data context and offering a path toward robust, post-hoc watermarking of non-differentiable, sequential tree ensembles.

Abstract

Gradient Boosting Decision Trees (GBDTs) are widely used in industry and academia for their high accuracy and efficiency, particularly on structured data. However, watermarking GBDT models remains underexplored compared to neural networks. In this work, we present the first robust watermarking framework tailored to GBDT models, utilizing in-place fine-tuning to embed imperceptible and resilient watermarks. We propose four embedding strategies, each designed to minimize impact on model accuracy while ensuring watermark robustness. Through experiments across diverse datasets, we demonstrate that our methods achieve high watermark embedding rates, low accuracy degradation, and strong resistance to post-deployment fine-tuning.

Paper Structure

This paper contains 22 sections, 10 equations, 4 figures, 7 tables, 2 algorithms.

Figures (4)

  • Figure 1: Illustration of the in-place updating process for a single tree of an initial gradient boosting model trained on dataset $\mathcal{D}$, as detailed in Algorithm \ref{alg:inplace}. If the optimal split at the root node of a subtree changes due to the additional data ($\mathcal{D}_{\text{fine}}$), the corresponding subtree is retrained. If the optimal split does not change, no retraining is needed, and only the leaf nodes that receive additional data need to be updated.
  • Figure 2: Example of an initial model, watermark selection, and embedding for the different selection methods. In (a), the decision boundaries and predictions of the initial model are shown as background shading, with the $\mathcal{D}_{\text{cand}}$ dataset overlaid in feature space using colored circles to represent ground truth labels. (b1) highlights watermark candidates for the Wrong Prediction Flip approach, i.e., samples misclassified by the model (circle color differs from the background), with the selected watermarks outlined with thick edges for $n = 4$ and $k = 3$. (c1) displays the modified labels used during fine-tuning (second most probable incorrect class) and the anticipated boundary adjustments. (b2) and (c2) follow the same visual conventions for the Outlier Flip approach, where the selected watermarks are the samples most distant from others in feature space, and the new label is the highest-probability incorrect class. (b3) shows the Cluster Center Flip strategy, where cluster centroids are selected as watermark candidates. Their $l$ nearest neighbors, which reinforce the original labels, are marked with triangles. (c3) depicts the label and boundary changes resulting from tuning the watermarks and neighbors. Finally, (b4) and (c4) illustrate the Confidence Flip approach, where low-confidence correct predictions (often near decision boundaries) are selected.
  • Figure 3: The proportion of incorrect predictions made by our initial model that are also incorrect for models trained using other GBDT libraries, on the optdigit dataset. The relatively high similarity shows that using incorrect predictions as watermarks risks selecting samples that are simply "hard", so unrelated models can make predictions similar to the watermark, leading to ambiguity problems.
  • Figure : GBDT In-place updating
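
The candidate-selection rules described in the Figure 2 caption, and the overlap quantity discussed in the Figure 3 caption, can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the function names, the use of a class-probability matrix `proba` as input, and the reading of the Figure 3 quantity as "fraction of our model's errors that another model also misclassifies" are all assumptions made here.

```python
import numpy as np


def wrong_prediction_candidates(proba, y_true):
    """Indices of misclassified samples (Wrong Prediction Flip candidates).

    Hypothetical helper: `proba` is an (n_samples, n_classes) probability matrix.
    """
    y_pred = proba.argmax(axis=1)
    return np.flatnonzero(y_pred != y_true)


def confidence_flip_candidates(proba, y_true, k):
    """Indices of the k correctly predicted samples with the lowest confidence.

    Confidence is taken to be the top-class probability (an assumption).
    """
    y_pred = proba.argmax(axis=1)
    correct = np.flatnonzero(y_pred == y_true)
    order = np.argsort(proba[correct].max(axis=1), kind="stable")
    return correct[order[:k]]


def top_incorrect_label(proba, y_true):
    """Highest-probability class other than the ground truth, per sample.

    This matches the relabeling the caption gives for Outlier Flip.
    """
    masked = np.array(proba, dtype=float)  # copy so the input is not modified
    masked[np.arange(len(y_true)), y_true] = -np.inf
    return masked.argmax(axis=1)


def shared_error_rate(y_true, pred_ours, pred_other):
    """Fraction of our model's errors that the other model also gets wrong."""
    ours_wrong = pred_ours != y_true
    if not ours_wrong.any():
        return 0.0
    return float(np.mean(pred_other[ours_wrong] != y_true[ours_wrong]))
```

A high `shared_error_rate` between independently trained models is the ambiguity risk the Figure 3 caption points to: if unrelated models share the same mistakes, misclassified samples alone are weak evidence of ownership.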