Regret-Optimal Federated Transfer Learning for Kernel Regression with Applications in American Option Pricing

Xuwei Yang; Anastasis Kratsios; Florian Krach; Matheus Grasselli; Aurelien Lucchi

TL;DR

A nearly regret-optimal heuristic for federated transfer learning is developed that runs with $\mathcal{O}(Np^2)$ fewer elementary operations, where $N$ is the number of datasets and $p$ is the dimension of the parameter space. The robustness of the regret-optimal algorithm to an adversary that perturbs $q$ training pairs by at most $\varepsilon>0$ across all training sets is also investigated.

Abstract

We propose an optimal iterative scheme for federated transfer learning, where a central planner has access to datasets ${\cal D}_1,\dots,{\cal D}_N$ for the same learning model $f_\theta$. Our objective is to minimize the cumulative deviation of the generated parameters $\{\theta_i(t)\}_{t=0}^T$ across all $T$ iterations from the specialized parameters $\theta^\star_{1},\ldots,\theta^\star_N$ obtained for each dataset, while respecting the loss function for the model $f_{\theta(T)}$ produced by the algorithm upon halting. We only allow for continual communication between each of the specialized models (nodes/agents) and the central planner (server), at each iteration (round). For the case where the model $f_\theta$ is a finite-rank kernel regression, we derive explicit updates for the regret-optimal algorithm. By leveraging symmetries within the regret-optimal algorithm, we further develop a nearly regret-optimal heuristic that runs with $\mathcal{O}(Np^2)$ fewer elementary operations, where $p$ is the dimension of the parameter space. Additionally, we investigate the adversarial robustness of the regret-optimal algorithm, showing that an adversary which perturbs $q$ training pairs by at most $\varepsilon>0$, across all training sets, cannot reduce the regret-optimal algorithm's regret by more than $\mathcal{O}(\varepsilon q \bar{N}^{1/2})$, where $\bar{N}$ is the aggregate number of training pairs. To validate our theoretical findings, we conduct numerical experiments in the context of American option pricing, utilizing a randomly generated finite-rank kernel.
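
To make the setting in the abstract concrete, the following is a minimal sketch and not the paper's regret-optimal algorithm. It assumes a finite-rank kernel of the form $k(x,y)=\phi(x)^\top\phi(y)$ with a randomly generated feature map $\phi:\mathbb{R}^d\to\mathbb{R}^p$, ridge-regularized least squares for the specialized parameters $\theta^\star_i$, and a squared Euclidean norm for the cumulative deviation. The names `make_feature_map`, `fit_specialized`, and `cumulative_deviation`, the synthetic data, and the linear-interpolation baseline are all hypothetical choices made for illustration.

```python
import numpy as np

def make_feature_map(d, p, rng):
    """Random feature map phi: R^d -> R^p (assumed form of the finite-rank kernel)."""
    W = rng.normal(size=(p, d))
    b = rng.normal(size=p)
    return lambda X: np.tanh(X @ W.T + b)

def fit_specialized(Phi, y, lam=1e-3):
    """Ridge solution theta* = (Phi^T Phi + lam I)^{-1} Phi^T y for one dataset."""
    p = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ y)

def cumulative_deviation(thetas, theta_stars):
    """Sum over iterations t and nodes i of ||theta_i(t) - theta*_i||^2.

    thetas has shape (T + 1, N, p); theta_stars has shape (N, p).
    """
    return float(np.sum((thetas - theta_stars[None]) ** 2))

rng = np.random.default_rng(0)
d, p, N, n = 2, 5, 3, 40          # input dim, parameter dim, datasets, pairs per dataset
phi = make_feature_map(d, p, rng)

# Synthetic datasets D_1, ..., D_N with slightly different targets (illustrative only).
datasets = []
for i in range(N):
    X = rng.normal(size=(n, d))
    y = np.sin(X[:, 0]) + 0.1 * i * X[:, 1]
    datasets.append((X, y))

# Specialized parameters theta*_1, ..., theta*_N, one per dataset.
theta_stars = np.stack([fit_specialized(phi(X), y) for X, y in datasets])

# Naive baseline trajectory (NOT the regret-optimal update): each node moves
# linearly from its own theta*_i to the uniformly weighted average over T rounds.
w = np.full(N, 1.0 / N)
theta_bar = w @ theta_stars
T = 4
thetas = np.stack([(1 - s) * theta_stars + s * theta_bar
                   for s in np.linspace(0.0, 1.0, T + 1)])

dev = cumulative_deviation(thetas, theta_stars)
print(f"cumulative deviation of the baseline trajectory: {dev:.4f}")
```

A trajectory that never leaves the specialized parameters has zero cumulative deviation but ignores the terminal loss of the shared model, while a trajectory that jumps straight to a common parameter pays the full deviation at every round. The scheme described in the abstract balances these two costs; the baseline above only illustrates the quantities involved.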

Paper Structure (7 sections, 1 theorem, 20 equations, 2 algorithms)

Introduction
Contributions
Related Work
Algorithms
Main Guarantees
Statistical Guarantees - Algorithm \ref{alg:RegretOptimizationHeuristic__WarmStart}
Discussion: Optimal Weighting

Key Result

Theorem 1

Fix $L\ge 0$ and let $\mathcal{F}$ be a non-empty family of $L$-Lipschitz functions mapping $\mathbb{R}^d$ to $\mathbb{R}^D$. There exists a constant $C\ge 1$ (depending only on $d+D$ and on $\mathcal{Z}$) such that: for every $0< \delta \le 1$, each $\eta\ge 0$, $\gamma>0$ and every $w\in \Delta_N$, t where $\bar{L}\stackrel{\text{def.}}{=} L_{\ell} \max\{1,L\}$, $\operatorname{KL}(\ma

Theorems & Definitions (2)

Remark 1
Theorem 1: Non-Asymptotic Transfer Learning Guarantee (General Lipschitz Learners)