Multitask Learning with Learned Task Relationships

Zirui Wan, Stefan Vlaski

TL;DR

This work addresses the statistical inefficiency of consensus-based learning under heterogeneous task distributions by proposing a decentralized multitask framework that jointly learns inter-task relationships and local models. Task relationships are modeled via a Gaussian Markov Random Field whose unknown precision matrix is constrained to be a graph Laplacian, and a stochastic-gradient-based procedure estimates both the Laplacian and the local parameters. The authors derive finite-sample and asymptotic results, showing that the Laplacian estimation error vanishes as the step-size shrinks and data accumulate, and establish asymptotic normality of the parameter estimates. Simulations demonstrate that leveraging the learned Laplacian accelerates adaptation and improves estimation accuracy over non-cooperative and consensus-based approaches, which is relevant for large-scale, heterogeneous networks.

Abstract

Classical consensus-based strategies for federated and decentralized learning are statistically suboptimal in the presence of heterogeneous local data or task distributions. As a result, in recent years, there has been growing interest in multitask or personalized strategies, which allow individual agents to benefit from one another in pursuing locally optimal models without enforcing consensus. Existing strategies require either precise prior knowledge of the underlying task relationships or are fully non-parametric and instead rely on meta-learning or proximal constructions. In this work, we introduce an algorithmic framework that strikes a balance between these extremes. By modeling task relationships through a Gaussian Markov Random Field with an unknown precision matrix, we develop a strategy that jointly learns both the task relationships and the local models, allowing agents to self-organize in a way consistent with their individual data distributions. Our theoretical analysis quantifies the quality of the learned relationship, and our numerical experiments demonstrate its practical effectiveness.
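
As a hedged illustration of the modeling idea, the following shows one standard way to write a Gaussian prior whose precision is a graph Laplacian. The notation is an assumption made for this sketch and is not taken from the paper: $L = D - A$ is a weighted Laplacian with symmetric edge weights $a_{k\ell} \ge 0$, $w_k^o$ is the local model of agent $k$, and ${\mathcal{W}}^o = \mathrm{col}\{w_k^o\}$ stacks the local models.

$$
p({\mathcal{W}}^o) \;\propto\; \exp\!\Big(-\tfrac{1}{2}\,({\mathcal{W}}^o)^{\top}(L \otimes I)\,{\mathcal{W}}^o\Big) \;=\; \exp\!\Big(-\tfrac{1}{2}\sum_{k<\ell} a_{k\ell}\,\|w_k^o - w_\ell^o\|^2\Big)
$$

Under this form, a large weight $a_{k\ell}$ makes similar models at agents $k$ and $\ell$ more likely, while $a_{k\ell} = 0$ leaves the pair uncoupled. Because $L\mathbb{1} = 0$, the density is flat along the consensus direction, so the covariance is naturally described by the pseudo-inverse $L^{\dagger}$ rather than an ordinary inverse.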

Paper Structure

This paper contains 7 sections, 3 theorems, 29 equations, 4 figures.

Key Result

Theorem 1

For sufficiently small step-size $\mu$ and as $i \to \infty$, the sequence generated by the GD recursion converges in distribution to an approximately conditional Gaussian, where $\Pi$ denotes the steady-state error covariance matrix, which depends on the realization of the true parameter ${ \mathcal{W}}^o$. In particular, $\Pi$ is the unique symmetric positive semidefinite …

Figures (4)

  • Figure 1: Graph topology with assigned edge weights.
  • Figure 2: Covariance estimation error $\|\widehat{\Sigma}^{\perp}-L^{\dagger}\|^2$ versus $M$.
  • Figure 3: Laplacian estimation error $\|\widehat{L}-L\|^2$ versus $M$.
  • Figure 4: Learning performance of different algorithms.
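
The error metric in the Figure 2 caption can be illustrated with a small NumPy sketch. It is not the authors' estimator or experiment: it assumes scalar local parameters, a hypothetical 4-node graph, and a plain zero-mean sample covariance standing in for $\widehat{\Sigma}^{\perp}$. The function names `laplacian_from_weights`, `sample_gmrf`, and `covariance_error` are made up for this example.

```python
import numpy as np


def laplacian_from_weights(A):
    """Combinatorial Laplacian L = D - A of a symmetric weight matrix A."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A


def sample_gmrf(L, M, rng):
    """Draw M zero-mean Gaussian samples with covariance pinv(L).

    Samples live in the range of L, i.e. orthogonal to the all-ones
    vector when the graph is connected.
    """
    evals, evecs = np.linalg.eigh(L)
    keep = evals > 1e-10                            # drop the null space of L
    scale = evecs[:, keep] / np.sqrt(evals[keep])   # columns v_i / sqrt(lambda_i)
    z = rng.standard_normal((int(keep.sum()), M))
    return (scale @ z).T                            # shape (M, K)


def covariance_error(L, M, rng):
    """Squared Frobenius error between a sample covariance and pinv(L)."""
    W = sample_gmrf(L, M, rng)
    sigma_hat = W.T @ W / M                         # assumed stand-in estimator
    return np.linalg.norm(sigma_hat - np.linalg.pinv(L), "fro") ** 2


if __name__ == "__main__":
    # Hypothetical edge weights, chosen only for illustration.
    A = np.array([[0.0, 1.0, 0.0, 1.0],
                  [1.0, 0.0, 0.5, 0.0],
                  [0.0, 0.5, 0.0, 2.0],
                  [1.0, 0.0, 2.0, 0.0]])
    L = laplacian_from_weights(A)
    rng = np.random.default_rng(0)
    for M in (100, 1_000, 10_000):
        print(M, covariance_error(L, M, rng))
```

Running the script prints the error for increasing sample counts $M$. Under these assumptions the error should shrink roughly in proportion to $1/M$, which is the qualitative trend a plot of error versus $M$ would show.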

Theorems & Definitions (3)

  • Theorem 1: Asymptotic Normality
  • Lemma 1: Covariance estimation error
  • Theorem 2: Laplacian estimation error