How to Forget Clients in Federated Online Learning to Rank?

Shuyi Wang, Bing Liu, Guido Zuccon

TL;DR

This work addresses the right to be forgotten in Federated Online Learning to Rank (FOLTR) by introducing an unlearning mechanism adapted from FedEraser that uses stored historical updates to remove a departing client's contributions without retraining from scratch. The authors verify forgetting with a poisoning-attack-based evaluation across four learning-to-rank datasets, showing that the unlearned global ranker converges toward the baseline retrained model and that the approach reduces both computation and communication compared with full retraining. They also analyze key hyperparameters to balance efficiency and effectiveness and provide a first unlearning benchmark for FOLTR. The results support the practical viability of privacy-preserving unlearning in distributed online ranking systems, with implications for GDPR compliance in federated search services.

Abstract

Data protection legislation like the European Union's General Data Protection Regulation (GDPR) establishes the "right to be forgotten": a user (client) can request that contributions made using their data be removed from learned models. In this paper, we study how to remove the contributions made by a client participating in a Federated Online Learning to Rank (FOLTR) system. In a FOLTR system, a ranker is learned by aggregating local updates to the global ranking model. Local updates are learned in an online manner at the client level using queries and implicit interactions that have occurred within that specific client. By doing so, each client's local data is not shared with other clients or with a centralised search service, while at the same time clients can benefit from an effective global ranking model learned from the contributions of each client in the federation. We study an effective and efficient unlearning method that can remove a client's contribution without compromising the overall ranker effectiveness and without needing to retrain the global ranker from scratch. A key challenge is how to measure whether the model has unlearned the contributions from the client $c^*$ that has requested removal. For this, we instruct $c^*$ to perform a poisoning attack (adding noise to this client's updates) and then we measure whether the impact of the attack is lessened once the unlearning process has taken place. Through experiments on four datasets, we demonstrate the effectiveness and efficiency of the unlearning strategy under different combinations of parameter settings.
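
The verification idea in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's method or code: it uses plain federated averaging on a hypothetical weight vector rather than an online learning-to-rank model, and the function name `run_federation`, the noise levels, the learning rate, and the target weights are all assumptions chosen for illustration. It only shows why a client that adds noise to its updates leaves a measurable trace in the global model, which is what makes its removal checkable. The labels 10H-0M, 9H-1M, and 9H-0M follow the configuration names used in the figure captions (honest and malicious client counts).

```python
import math
import random


def run_federation(n_honest, n_malicious, rounds=50, dim=5, lr=0.5,
                   honest_std=0.01, attack_std=2.0, seed=0):
    """Toy federated averaging; returns the global model's distance from the target."""
    rng = random.Random(seed)
    w_true = [1.0] * dim   # hypothetical "ideal" ranker weights (assumption)
    w = [0.0] * dim        # global model, shared with every client each round
    n_clients = n_honest + n_malicious
    for _ in range(rounds):
        total = [0.0] * dim
        for c in range(n_clients):
            # Honest local update: a step toward the target plus small noise.
            update = [lr * (t - g) + rng.gauss(0.0, honest_std)
                      for t, g in zip(w_true, w)]
            if c >= n_honest:
                # Poisoning: the malicious client adds large noise to its update.
                update = [u + rng.gauss(0.0, attack_std) for u in update]
            total = [s + u for s, u in zip(total, update)]
        # The server aggregates by plain averaging of the client updates.
        w = [g + s / n_clients for g, s in zip(w, total)]
    return math.sqrt(sum((g - t) ** 2 for g, t in zip(w, w_true)))


errors = {
    "10H-0M": run_federation(10, 0),  # all clients honest
    "9H-1M": run_federation(9, 1),    # one client poisons its updates
    "9H-0M": run_federation(9, 0),    # poisoned client absent (retrain-from-scratch analogue)
}
for name, err in errors.items():
    print(f"{name}: distance from target = {err:.4f}")
```

In this toy setting the 9H-1M run ends noticeably farther from the target than the 9H-0M run, so an unlearning procedure applied to the 9H-1M model could be judged by how closely it closes that gap.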

Paper Structure (13 sections, 3 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Relationships between FOLTR configurations: 9H-1M (green line), 10H-0M (black), 9H-0M (pink). Circles are clients.
  • Figure 2: Offline effectiveness (nDCG@10) obtained under the 9H-1M (green line), 10H-0M (black line), 9H-0M (pink line) FOLTR configurations with three click modes (Perfect, Navigational, Informational). Results are averaged across all dataset splits and experimental runs. These results motivate the use of the evaluation methodology based on the malicious client to evaluate the effectiveness of unlearning.
  • Figure 3: Comparison of the offline effectiveness (nDCG@10) obtained after the unlearning method is applied (ranker $\mathcal{U}$(9H-1M), denoted as "unlearn") with that obtained when the ranker is retrained from scratch after client $c^*$ is removed (ranker 9H-0M, denoted as "baseline"). For the unlearning process, we set $n'_i = 3$ with $\Delta t = 10$ and show the evaluation values across all global steps. For the baseline setup, we only show the final nDCG@10 score after retraining finishes.