FedConPE: Efficient Federated Conversational Bandits with Heterogeneous Clients

Zhuohua Li, Maoli Liu, John C. S. Lui

TL;DR

FedConPE targets federated conversational bandits with finite arms and heterogeneous clients by combining phase elimination with adaptive key-term selection. It exploits the eigenstructure of the clients’ information matrices to identify deficient directions and uses key terms to mitigate uncertainty without sharing raw arm data, yielding a regret bound of $\tilde{\mathcal{O}}(\sqrt{dMT})$ and a communication cost of $\mathcal{O}(d^2 M \log T)$. Theoretical results establish near-minimax optimality with matching lower bounds, while empirical results on synthetic and real-world data show lower cumulative regret and fewer conversations compared to baselines. The approach enables privacy-preserving, communication-efficient distributed learning for conversational recommendations in heterogeneous federated settings, with practical impact on scalable, adaptive user-preference elicitation.
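
As a rough illustration of the eigenstructure idea in the summary above, the following sketch builds one client's information matrix from its local arm set, eigendecomposes it, and flags the directions whose eigenvalue falls below a threshold. Those are the directions about which a key term could be asked. The function name `deficient_directions`, the uniform weighting over arms, and the threshold value are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def deficient_directions(arms, threshold):
    """Return (eigenvalues, eigenvectors) of the information matrix
    whose eigenvalue is below `threshold`.

    arms: (K, d) array, one feature vector per arm.
    The uniform weighting over arms is an illustrative assumption.
    """
    arms = np.asarray(arms, dtype=float)
    num_arms = arms.shape[0]
    # Information matrix: average of outer products a a^T.
    info = arms.T @ arms / num_arms
    # eigh is appropriate because `info` is symmetric.
    eigvals, eigvecs = np.linalg.eigh(info)
    mask = eigvals < threshold
    # Rows of the returned array are the under-explored directions.
    return eigvals[mask], eigvecs[:, mask].T

# Hypothetical client whose arms barely cover the third coordinate.
arms = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.7, 0.7, 0.01]])
vals, dirs = deficient_directions(arms, threshold=0.05)
print(vals, dirs)
```

In this toy arm set, a single direction is flagged, and it is almost exactly the third coordinate axis, which none of the arms explores well.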

Abstract

Conversational recommender systems have emerged as a potent solution for efficiently eliciting user preferences. These systems interactively present queries associated with "key terms" to users and leverage user feedback to estimate user preferences more efficiently. Nonetheless, most existing algorithms adopt a centralized approach. In this paper, we introduce FedConPE, a phase elimination-based federated conversational bandit algorithm, where $M$ agents collaboratively solve a global contextual linear bandit problem with the help of a central server while ensuring secure data management. To effectively coordinate all the clients and aggregate their collected data, FedConPE uses an adaptive approach to construct key terms that minimize uncertainty across all dimensions in the feature space. Furthermore, compared with existing federated linear bandit algorithms, FedConPE offers improved computational and communication efficiency as well as enhanced privacy protections. Our theoretical analysis shows that FedConPE is minimax near-optimal in terms of cumulative regret. We also establish upper bounds for communication costs and conversation frequency. Comprehensive evaluations demonstrate that FedConPE outperforms existing conversational bandit algorithms while using fewer conversations.

Paper Structure

This paper contains 34 sections, 19 theorems, 34 equations, 9 figures, 1 table, 2 algorithms.

Key Result

Lemma 1

Assume that $\mathcal{A} \subset \mathbb{R}^d$ is compact and $\text{span}(\mathcal{A}) = \mathbb{R}^d$. Let $\pi: \mathcal{A} \to [0,1]$ be a distribution on $\mathcal{A}$ and define $\bm{V}(\pi)=\sum_{\bm{a} \in \mathcal{A}} \pi(\bm{a})\bm{a} \bm{a}^\mathsf{T}$. Then there exists a minimizer …
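
The definition of $\bm{V}(\pi)$ in the lemma above can be evaluated directly for a finite arm set. The sketch below does that, and also computes the worst-case quantity $\max_{\bm{a} \in \mathcal{A}} \bm{a}^\mathsf{T} \bm{V}(\pi)^{-1} \bm{a}$, which is a common objective in optimal experimental design. Since the lemma statement is truncated here, the choice of that objective is an assumption, and the helper names `design_matrix` and `worst_case_variance` are hypothetical.

```python
import numpy as np

def design_matrix(arms, pi):
    """V(pi) = sum_a pi(a) * a a^T for a finite arm set.

    arms: (K, d) array of feature vectors; pi: (K,) weights summing to 1.
    """
    arms = np.asarray(arms, dtype=float)
    pi = np.asarray(pi, dtype=float)
    # Scale each arm by its weight, then sum the outer products.
    return (arms * pi[:, None]).T @ arms

def worst_case_variance(arms, pi):
    """max_a a^T V(pi)^{-1} a (assumed objective; requires V(pi) invertible)."""
    arms = np.asarray(arms, dtype=float)
    v_inv = np.linalg.inv(design_matrix(arms, pi))
    return max(float(a @ v_inv @ a) for a in arms)

# Toy example: standard basis of R^3 with uniform weights.
arms = np.eye(3)
pi = np.full(3, 1.0 / 3.0)
V = design_matrix(arms, pi)
g = worst_case_variance(arms, pi)
print(V)
print(g)
```

For the standard basis with uniform weights, $\bm{V}(\pi)$ is the identity scaled by $1/3$ and the worst-case value equals the dimension, 3.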

Figures (9)

  • Figure 1: An example of conversational recommendation: ChatGPT sometimes gives two options for users to choose their preference.
  • Figure 2: Cumulative regret for the single-client scenario.
  • Figure 3: Cumulative regret for the multi-client scenario.
  • Figure 4: Cumulative regret vs. number of clients.
  • Figure 5: Cumulative regret vs. size of arm sets.
  • ...and 4 more figures

Theorems & Definitions (34)

  • Lemma 1: ?
  • Theorem 1: Regret upper bound
  • Remark 1
  • Theorem 2: Regret lower bound
  • Remark 2
  • Theorem 3: Communication cost
  • Remark 3
  • Theorem 4: Conversation frequency upper bound
  • Remark 4
  • Lemma 2: Subgaussian random variables
  • ...and 24 more