BlackCATT: Black-box Collusion Aware Traitor Tracing in Federated Learning

Elena Rodríguez-Lois, Fabio Brau, Maura Pintor, Battista Biggio, Fernando Pérez-González

TL;DR

BlackCATT addresses black-box traitor tracing in Federated Learning under collusion. It introduces a collusion-aware embedding loss, L^{CA}, and iteratively optimizes a shared trigger set X instead of keeping it fixed, so that watermarks remain traceable while main-task performance is maintained. A variant, BlackCATT+FR, adds functional regularization to counter update incompatibility between model copies that have learned different watermarks. Experimental results on ResNet18 and VGG16 across CIFAR-10 and CIFAR-100 demonstrate improved convergence, robust tracing under collusion (via MAV-based accusations and dynamic Z^{(t)} thresholds), and resilience to post-leak attacks such as pruning and fine-tuning. Limitations include i.i.d. data assumptions and a fully trusted aggregator; future work includes extending to non-i.i.d. FL, untrusted aggregators, and non-classification tasks, with potential gains from larger trigger sets and more optimization rounds.

Abstract

Federated Learning has been popularized in recent years for applications involving personal or sensitive data, as it allows the collaborative training of machine learning models through local updates at the data-owners' premises, which does not require the sharing of the data itself. Considering the risk of leakage or misuse by any of the data-owners, many works attempt to protect their copyright, or even trace the origin of a potential leak through unique watermarks identifying each participant's model copy. Realistic accusation scenarios impose a black-box setting, where watermarks are typically embedded as a set of sample-label pairs. The threat of collusion, however, where multiple bad actors conspire together to produce an untraceable model, has rarely been addressed, and previous works have been limited to shallow networks and near-linearly separable main tasks. To the best of our knowledge, this work is the first to present a general collusion-resistant embedding method for black-box traitor tracing in Federated Learning: BlackCATT, which introduces a novel collusion-aware embedding loss term and, instead of using a fixed trigger set, iteratively optimizes the triggers to aid convergence and traitor tracing performance. Experimental results confirm the efficacy of the proposed scheme across different architectures and datasets. Furthermore, for models that would otherwise suffer from update incompatibility on the main task after learning different watermarks (e.g., architectures including batch normalization layers), our proposed BlackCATT+FR incorporates functional regularization through a set of auxiliary examples at the aggregator, promoting a shared feature space among model copies without compromising traitor tracing performance.
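
To make the black-box setting concrete, the following minimal sketch checks a suspect model against per-participant trigger labels using only its predicted labels. It is an illustration under stated assumptions, not the paper's accusation rule: the function names, the fixed threshold, and the toy label construction are all hypothetical, and BlackCATT's actual MAV-based accusations and dynamic thresholds are not reproduced here.

```python
import numpy as np

def trigger_match_rates(predict, triggers, labels_per_owner):
    """Fraction of triggers on which the suspect model outputs each owner's label.

    predict          -- black-box query function returning one label per trigger
    labels_per_owner -- array of shape (n_owners, n_triggers), hypothetical layout
    """
    preds = np.asarray(predict(triggers))  # only predicted labels are observed
    return (labels_per_owner == preds[None, :]).mean(axis=1)

def accuse(rates, threshold):
    """Indices of owners whose match rate reaches a fixed threshold (an assumption)."""
    return np.flatnonzero(rates >= threshold)

# Toy setup: 4 owners, 50 shared triggers, 10 classes.
rng = np.random.default_rng(0)
n_owners, n_triggers, n_classes = 4, 50, 10
triggers = rng.normal(size=(n_triggers, 8))  # placeholder trigger samples
base = rng.integers(0, n_classes, size=n_triggers)
# Shift the base labels per owner so every owner has a distinct label on each trigger.
labels_per_owner = (base[None, :] + np.arange(n_owners)[:, None]) % n_classes

# Stand-in for a leaked model: it answers with owner 2's watermark labels.
leaked_predict = lambda x: labels_per_owner[2]

rates = trigger_match_rates(leaked_predict, triggers, labels_per_owner)
accused = accuse(rates, threshold=0.5)
```

In this toy case only owner 2 matches on every trigger. A model produced by colluders would generally match no single participant this cleanly, which is the situation the collusion-aware embedding is meant to address.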

Paper Structure (33 sections, 12 equations, 13 figures, 4 tables)

Figures (13)

  • Figure 1: Accuracy of unique trigger-label sets (bar plot) and task accuracy (solid line) under different attacks, with a red dashed line indicating the random classification threshold for the trigger-label sets.
  • Figure 2: Conceptual representation of BlackCATT. The aggregator-side framework consists of five main steps: (1) Collection of trained model copies; (2) Task Arithmetic to update model copies; (3) Collusion-aware embedding of the unique watermarks; (4) Optimization of the Shared Trigger Set and (5) Return to data-owners. For a leak involving one or multiple malicious participants, the watermarks can be verified through black-box queries, either through the aggregator or an independent verifier.
  • Figure 3: Evolution of training metrics, with a vertical dashed line representing the FNR $\simeq 0.5$ threshold for $\mathcal{C}^2$.
  • Figure 4: Impact of the number of data-owners $N$, with a vertical dashed line representing the FNR $\simeq 0.5$ threshold for $\mathcal{C}^2$.
  • Figure 5: Traitor tracing capabilities against parameter averaging (solid line) and random layer sampling (dashed line) in terms of exposed triggers before an accusation ($t^*$) and FNR.
  • ...and 8 more figures
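
The two collusion strategies named in the Figure 5 caption, parameter averaging and random layer sampling, can be sketched as follows. This is a hedged illustration on toy parameter dictionaries: the function names and layer names are hypothetical, and the exact attack definitions used in the paper's experiments may differ.

```python
import numpy as np

def average_parameters(copies):
    """Collude by averaging each layer's parameters across the colluders' copies."""
    return {name: np.mean([c[name] for c in copies], axis=0) for name in copies[0]}

def sample_layers(copies, rng):
    """Collude by taking each layer, whole, from one randomly chosen colluder."""
    return {
        name: copies[rng.integers(len(copies))][name].copy()
        for name in copies[0]
    }

# Three toy model copies; copy i has every parameter equal to i.
rng = np.random.default_rng(0)
copies = [
    {"conv.weight": np.full((2, 2), float(i)), "fc.weight": np.full((3,), float(i))}
    for i in range(3)
]

averaged = average_parameters(copies)  # every entry is the mean of 0, 1, 2
sampled = sample_layers(copies, rng)   # each layer equals one colluder's layer
```

Both operations act directly on parameters, so they need white-box access to the colluders' own copies; the tracing side still relies only on black-box queries to the resulting model.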