Robust multi-task boosting using clustering and local ensembling

Seyedsaman Emami, Daniel Hernández-Lobato, Gonzalo Martínez-Muñoz

TL;DR

RMB-CLE tackles negative transfer in multi-task learning by deriving inter-task relatedness from cross-task generalization errors and using adaptive hierarchical clustering to discover multiple task groups. It then trains cluster-specific local ensembles that share knowledge within coherent task groups while isolating incompatible tasks, providing a principled alternative to fixed clusters or binary inlier/outlier partitions. Theoretical analysis shows that, in regression, the cross-task error decomposes into a functional distance plus irreducible noise and that, in classification, the excess risk is upper-bounded by the disagreement between task predictors; both results justify the similarity measure. Empirically, RMB-CLE recovers latent task structure on synthetic data and consistently outperforms single-task, pooling, and existing multi-task boosting methods on real-world benchmarks, demonstrating robustness to distributional shift and task heterogeneity. The approach is model-agnostic, scalable in moderate settings, and establishes a general framework for robust, cluster-aware multi-task learning using error-driven task relationships.
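
As a sketch of why cross-task error can act as a similarity measure, consider a simplified setting. The notation below is an assumption made for illustration and may differ from the paper's own statements. Suppose task $j$ generates labels as $y = f_j(x) + \varepsilon$ with $\mathbb{E}[\varepsilon \mid x] = 0$ and $\mathrm{Var}(\varepsilon \mid x) = \sigma_j^2$, and let $h_i$ be a fixed predictor trained on task $i$. The cross term vanishes because the noise has zero conditional mean, so the squared cross-task error splits into a functional distance and a noise floor:

$$
\mathbb{E}_{(x,y)\sim \mathcal{D}_j}\big[(y - h_i(x))^2\big]
= \mathbb{E}_{x}\big[(f_j(x) - h_i(x))^2\big] + \sigma_j^2 .
$$

For classification with the 0-1 loss, the pointwise inequality $\mathbb{1}[h_i(x) \neq y] \le \mathbb{1}[h_i(x) \neq h_j(x)] + \mathbb{1}[h_j(x) \neq y]$ gives, after taking expectations over task $j$, a bound on the extra risk of using $h_i$ in place of $h_j$:

$$
R_j(h_i) - R_j(h_j) \le \Pr_{x \sim \mathcal{D}_j}\big[h_i(x) \neq h_j(x)\big].
$$

In both cases a small cross-task error or disagreement indicates that the two tasks are functionally close, which is the quantity the clustering step relies on.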

Abstract

Multi-Task Learning (MTL) aims to boost predictive performance by sharing information across related tasks, yet conventional methods often suffer from negative transfer when unrelated or noisy tasks are forced to share representations. We propose Robust Multi-Task Boosting using Clustering and Local Ensembling (RMB-CLE), a principled MTL framework that integrates error-based task clustering with local ensembling. Unlike prior work that assumes fixed clusters or hand-crafted similarity metrics, RMB-CLE derives inter-task similarity directly from cross-task errors, which admit a risk decomposition into functional mismatch and irreducible noise, providing a theoretically grounded mechanism to prevent negative transfer. Tasks are grouped adaptively via agglomerative clustering, and within each cluster, a local ensemble enables robust knowledge sharing while preserving task-specific patterns. Experiments show that RMB-CLE recovers ground-truth clusters in synthetic data and consistently outperforms multi-task, single-task, and pooling-based ensemble methods across diverse real-world and synthetic benchmarks. These results demonstrate that RMB-CLE is not merely a combination of clustering and boosting but a general and scalable framework that establishes a new basis for robust multi-task learning.
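
The pipeline described above (cross-task errors, agglomerative clustering, one local ensemble per cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the function names, the use of scikit-learn's `GradientBoostingRegressor` as the base learner, average linkage, the fixed number of clusters, and the pooled per-cluster model standing in for the local ensemble are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.ensemble import GradientBoostingRegressor


def make_tasks(seed=0, n=200):
    """Hypothetical toy data: tasks 0-1 follow sin(3x), tasks 2-3 follow -sin(3x)."""
    rng = np.random.default_rng(seed)
    tasks = []
    for sign in (1.0, 1.0, -1.0, -1.0):
        X = rng.uniform(-1.0, 1.0, size=(n, 1))
        y = sign * np.sin(3.0 * X[:, 0]) + 0.1 * rng.normal(size=n)
        tasks.append((X, y))
    return tasks


def cross_task_errors(train, held_out):
    """E[i, j] = held-out MSE on task j of the model trained on task i."""
    models = [
        GradientBoostingRegressor(n_estimators=50, random_state=0).fit(X, y)
        for X, y in train
    ]
    E = np.zeros((len(train), len(held_out)))
    for i, model in enumerate(models):
        for j, (X, y) in enumerate(held_out):
            E[i, j] = np.mean((y - model.predict(X)) ** 2)
    return E


def cluster_tasks(E, n_clusters):
    """Agglomerative clustering on the symmetrized cross-task error matrix."""
    D = 0.5 * (E + E.T)        # make the dissimilarity symmetric
    np.fill_diagonal(D, 0.0)   # a task has zero distance to itself
    Z = linkage(squareform(D, checks=False), method="average")
    # Assumption: the number of clusters is given; the paper selects it adaptively.
    return fcluster(Z, t=n_clusters, criterion="maxclust")


def fit_local_ensembles(train, labels):
    """Stand-in for local ensembling: one boosted model per cluster on pooled data."""
    ensembles = {}
    for c in np.unique(labels):
        members = [t for t, lab in enumerate(labels) if lab == c]
        X = np.vstack([train[t][0] for t in members])
        y = np.concatenate([train[t][1] for t in members])
        ensembles[int(c)] = GradientBoostingRegressor(
            n_estimators=50, random_state=0
        ).fit(X, y)
    return ensembles


tasks = make_tasks()
train = [(X[:100], y[:100]) for X, y in tasks]
held_out = [(X[100:], y[100:]) for X, y in tasks]

E = cross_task_errors(train, held_out)
labels = cluster_tasks(E, n_clusters=2)
ensembles = fit_local_ensembles(train, labels)
print("cross-task error matrix:\n", np.round(E, 3))
print("cluster labels:", labels)
```

In this toy setting the cross-task error between tasks with opposite signs is dominated by the functional mismatch, while the error between same-sign tasks stays near the noise level, so the two groups separate cleanly. Real data would generally need a data-driven choice of the number of clusters and a local ensemble that retains task-specific components.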

Paper Structure (48 sections, 46 equations, 8 figures, 19 tables, 1 algorithm)

Figures (8)

  • Figure 1: One-dimensional synthetic multi-task dataset with $\mathcal{T}=8$, $C=4$, and $\omega = 0.9$. Colors indicate clusters and markers denote tasks.
  • Figure 2: Synthetic task-wise Demšar plots ($p=0.05$) comparing multi-task models; solid lines indicate no significant differences (Nemenyi).
  • Figure 3: Cluster assignment stability over 100 runs. Each panel shows the fraction of times a task is assigned to each inferred cluster. Columns denote tasks and rows denote clusters; brighter colors indicate more stable assignments. Ground-truth clusters are reported in Table \ref{tab:true_clusters}.
  • Figure 4: Stability of sigmoid-based task weighting in R-MTGB over 100 runs. Each panel shows the learned task ($i$) weights $\sigma(\theta_i)$ across runs. Columns denote runs and rows denote tasks; color intensity reflects the magnitude of $\sigma(\theta_i)$, indicating how strongly tasks are weighted.
  • Figure 5: Real-world task-wise Demšar plots ($p=0.05$) comparing multi-task models; solid lines indicate no significant differences (Nemenyi).
  • ...and 3 more figures