Decentralized Collaborative Learning with Adaptive Reference Data for On-Device POI Recommendation

Ruiqi Zheng, Liang Qu, Tong Chen, Lizhen Cui, Yuhui Shi, Hongzhi Yin

TL;DR

The paper tackles privacy-preserving, on-device POI recommendation when each user's local data is sparse. It introduces DARD, which builds a desensitized public reference-data pool and then uses training-loss tracking plus influence functions to select reference data per user, so that neighbors can exchange knowledge via distillation on a combined set of their adaptive reference data. Experiments on the Weeplace and Foursquare datasets show that DARD outperforms centralized and decentralized baselines, remains robust when reference data is limited, and is compatible with multiple base recommendation models.

Abstract

In Location-based Social Networks, Point-of-Interest (POI) recommendation helps users discover interesting places. There is a trend to move from the cloud-based model to on-device recommendations for privacy protection and reduced server reliance. Due to the scarcity of local user-item interactions on individual devices, solely relying on local instances is not adequate. Collaborative Learning (CL) emerges to promote model sharing among users, where reference data is an intermediary that allows users to exchange their soft decisions without directly sharing their private data or parameters, ensuring privacy and benefiting from collaboration. However, existing CL-based recommendations typically use a single reference for all users. Reference data valuable for one user might be harmful to another, given diverse user preferences. Users may not offer meaningful soft decisions on items outside their interest scope. Consequently, using the same reference data for all collaborations can impede knowledge exchange and lead to sub-optimal performance. To address this gap, we introduce the Decentralized Collaborative Learning with Adaptive Reference Data (DARD) framework, which crafts adaptive reference data for effective user collaboration. It first generates a desensitized public reference data pool with transformation and probability data generation methods. For each user, the selection of adaptive reference data is executed in parallel by training loss tracking and influence function. Local models are trained with individual private data and collaboratively with the geographical and semantic neighbors. During the collaboration between two users, they exchange soft decisions based on a combined set of their adaptive reference data. Our evaluations across two real-world datasets highlight DARD's superiority in recommendation performance and addressing the scarcity of available reference data.
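
The exchange of soft decisions over a combined reference set can be sketched as follows. This is a minimal illustration, not the authors' implementation: the KL-divergence distillation loss, the function names, and the toy models `model_a` and `model_b` are all assumptions introduced here, since the abstract does not specify the exact loss.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution (a 'soft decision')."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def combined_reference(ref_a, ref_b):
    """Order-preserving union of two users' adaptive reference sets."""
    return list(dict.fromkeys(list(ref_a) + list(ref_b)))

def distillation_loss(own_model, neighbor_model, reference):
    """Mean KL between the neighbor's and the user's own soft decisions.

    Assumption: KL divergence stands in for whatever loss the paper uses.
    Only soft decisions on reference data are compared; no private data
    or model parameters are exchanged.
    """
    losses = [
        kl_divergence(softmax(neighbor_model(x)), softmax(own_model(x)))
        for x in reference
    ]
    return sum(losses) / len(losses)

# Hypothetical toy models: each maps a reference sequence to logits over 3 POIs.
def model_a(seq):
    return [float(len(seq)), 1.0, 0.0]

def model_b(seq):
    return [0.0, 1.0, float(len(seq))]

# Each user contributes their own adaptive reference data (POI sequences).
shared = combined_reference(
    [("p1", "p2"), ("p3",)],
    [("p3",), ("p2", "p4", "p1")],
)
loss_a = distillation_loss(model_a, model_b, shared)
```

In this sketch, user A would add `loss_a` to the loss on their private data, so the local model is pulled toward the neighbor's soft decisions only on reference instances that at least one of the two users selected.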

Paper Structure

This paper contains 30 sections, 2 theorems, 12 equations, 6 figures, 2 tables, 1 algorithm.

Key Result

Lemma 1

Discarding or downweighting the training samples in $\mathcal{D}'_{\_} = \{ \mathcal{X}_j \in \mathcal{D}' \mid \Psi_{\theta}(\mathcal{X}_j) > 0 \}$ from $\mathcal{D}'$ could lead to a model with lower test risk over $\mathcal{Q}$. Here $\hat{\theta}_{\epsilon}$ denotes the optimal parameters of that model, obtained by updating the parameters after discarding or downweighting the samples in $\mathcal{D}'_{\_}$.
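
The selection rule in Lemma 1 can be illustrated with a short sketch. It assumes the influence scores $\Psi_{\theta}(\mathcal{X}_j)$ have already been computed; how they are computed is not shown in this excerpt. The function names and the score values below are hypothetical.

```python
def reweight_by_influence(pool, scores, down_weight=0.0):
    """Return (sample, weight) pairs for a reference pool.

    Samples whose influence score is strictly positive (the set flagged in
    Lemma 1) receive `down_weight`; all others keep weight 1.0.
    down_weight=0.0 corresponds to discarding the flagged samples.
    """
    if len(pool) != len(scores):
        raise ValueError("pool and scores must have the same length")
    return [
        (sample, down_weight if psi > 0 else 1.0)
        for sample, psi in zip(pool, scores)
    ]

def discard_flagged(pool, scores):
    """Keep only the samples that Lemma 1 does not flag for removal."""
    return [s for s, w in reweight_by_influence(pool, scores) if w > 0]

# Hypothetical candidate pool and precomputed influence scores.
pool = ["seq_a", "seq_b", "seq_c", "seq_d"]
psi = [0.8, -0.3, 0.0, 1.5]

kept = discard_flagged(pool, psi)                       # drops seq_a and seq_d
soft = reweight_by_influence(pool, psi, down_weight=0.5)  # downweights instead
```

Note that a score of exactly zero is kept, because the lemma's set uses a strict inequality.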

Figures (6)

  • Figure 1: Impact of reference data selection on recommendation performance. "Original" refers to no selection: the whole reference data candidate pool is used for all users.
  • Figure 2: Overview of DARD. a) Step 1: desensitized sequences are generated on-device and uploaded to the server, which aggregates them into the candidate pool. The server is involved only in this initial stage, to deploy the pool and define each user's neighbors. b) Model CL paradigm: user models are trained on individual data and collaboratively with neighbors. c) Step 2: train under the CL paradigm with the candidate pool and track the loss to delete noisy reference data instances for the target user. d) Step 3: use the influence function to select adaptive reference data. e) Step 4: retrain under the CL paradigm with the adaptive reference data.
  • Figure 3: The performance with different amounts of reference data on Weeplace.
  • Figure 4: Performance of DARD integrated with different Recommendation Models.
  • Figure 5: Results of the ablation experiment on different parts of DARD on Weeplace.
  • ...and 1 more figure
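
Step 2 in the Figure 2 caption (tracking the loss to delete noisy reference instances) could look roughly like the sketch below. The criterion used here, dropping the instances with the highest mean loss across epochs, is an assumption made for illustration; the caption does not state the actual rule. The function name, `drop_ratio`, and the loss values are hypothetical.

```python
def drop_noisy_by_loss(loss_history, drop_ratio=0.25):
    """Keep the reference instances with the lowest mean tracked loss.

    loss_history maps an instance id to its list of per-epoch losses.
    Assumption: instances whose loss stays high across epochs are treated
    as noisy for the target user and removed.
    """
    if not 0.0 <= drop_ratio < 1.0:
        raise ValueError("drop_ratio must be in [0, 1)")
    mean_loss = {k: sum(v) / len(v) for k, v in loss_history.items()}
    ranked = sorted(mean_loss, key=mean_loss.get)  # lowest mean loss first
    n_keep = len(ranked) - int(len(ranked) * drop_ratio)
    return ranked[:n_keep]

# Hypothetical per-epoch losses for four candidate reference instances.
history = {
    "r1": [0.9, 0.5, 0.2],
    "r2": [1.1, 1.0, 1.2],  # loss never decreases, so treated as noisy here
    "r3": [0.8, 0.4, 0.3],
    "r4": [0.7, 0.6, 0.4],
}
kept_ids = drop_noisy_by_loss(history, drop_ratio=0.25)
```

The surviving instances would then be passed to the influence-function selection in Step 3.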

Theorems & Definitions (2)

  • Lemma 1
  • Lemma 2