Robust-Multi-Task Gradient Boosting

Seyedsaman Emami, Gonzalo Martínez-Muñoz, Daniel Hernández-Lobato

TL;DR

This work addresses robustness in multi-task learning when tasks exhibit varying degrees of relatedness, including adversarial or outlier tasks. It introduces Robust-Multi-Task Gradient Boosting (R-MTGB), a three-block boosting framework that (1) learns a shared representation across all tasks, (2) performs outlier-aware task partitioning with sigmoid-based weights to down-weight disruptive tasks, and (3) fine-tunes task-specific predictors. The approach unifies shared learning, outlier handling, and per-task refinement within gradient boosting, with theoretical guarantees for Block 2 and empirical validation on synthetic and real-world datasets. Results show that R-MTGB isolates outliers, promotes beneficial knowledge transfer, achieves lower per-task errors, and maintains strong overall performance, while providing interpretable task-level outlier scores.

Abstract

Multi-task learning (MTL) exploits shared information across tasks to improve generalization, under the assumption that tasks share similarities that benefit performance. Boosting algorithms, in turn, have demonstrated exceptional performance across diverse learning problems, primarily due to their ability to focus on hard-to-learn instances and iteratively reduce residual errors, which makes them a promising approach for multi-task problems. However, real-world MTL scenarios often involve tasks that are not well aligned with the rest (known as outlier or adversarial tasks); these do not share beneficial similarities with the other tasks and can degrade the performance of the overall model. To overcome this challenge, we propose Robust-Multi-Task Gradient Boosting (R-MTGB), a novel boosting framework that explicitly models and adapts to task heterogeneity during training. R-MTGB structures the learning process into three sequential blocks: (1) learning shared patterns, (2) partitioning tasks into outliers and non-outliers with regularized parameters, and (3) fine-tuning task-specific predictors. This architecture enables R-MTGB to automatically detect and penalize outlier tasks while promoting effective knowledge transfer among related tasks. Our method integrates these mechanisms within gradient boosting, allowing robust handling of noisy or adversarial tasks without sacrificing accuracy. Extensive experiments on both synthetic benchmarks and real-world datasets demonstrate that our approach successfully isolates outliers, transfers knowledge, consistently reduces prediction errors for each individual task, and achieves overall performance gains across all tasks. These results highlight the robustness, adaptability, and reliable convergence of R-MTGB in challenging MTL environments.
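
The three-block structure can be pictured with a minimal sketch. The text here does not give the exact formula, so the additive combination below, the function name `combine_blocks`, and all of its arguments are illustrative assumptions, not the authors' implementation. The only detail taken from the source is that Block 2 weights its outlier and non-outlier components with a sigmoid of a per-task parameter.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def combine_blocks(shared, outlier, non_outlier, task_specific, theta):
    """Hypothetical per-task prediction built from three blocks.

    shared        -- Block 1 output (patterns common to all tasks)
    outlier       -- Block 2 component intended for outlier tasks
    non_outlier   -- Block 2 component intended for non-outlier tasks
    task_specific -- Block 3 output (per-task fine-tuning)
    theta         -- per-task parameter; sigmoid(theta) is the task weight

    Assumption: the blocks are summed, and Block 2 is a convex
    combination of its two components.
    """
    w = sigmoid(theta)
    block2 = w * outlier + (1.0 - w) * non_outlier
    return shared + block2 + task_specific


if __name__ == "__main__":
    # theta = 0 weights both Block 2 components equally.
    print(combine_blocks(1.0, 2.0, 4.0, 0.5, theta=0.0))
    # A large positive theta routes the task almost entirely to one component.
    print(combine_blocks(1.0, 2.0, 4.0, 0.5, theta=50.0))
```

Under these assumptions, a task whose `theta` is driven far from zero during training draws almost exclusively on one Block 2 component, which is how a disruptive task could be kept from influencing the component that the remaining tasks rely on.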

Paper Structure

This paper contains 20 sections, 45 equations, 9 figures, 13 tables, 1 algorithm.

Figures (9)

  • Figure 1: A visualization of the generated data points, comprising seven non-outlier (common) tasks (tasks 1 to 7) and one outlier task (task 8).
  • Figure 2: Average task-wise performance of the evaluated models over multiple runs shown separately for classification (left subplot) and regression (right subplot) tasks.
  • Figure 3: Mean and standard deviation of $\sigma(\boldsymbol{\theta})$ for each task, as learned by the R-MTGB model on the generated synthetic multi-task data. Values of $\sigma(\boldsymbol{\theta})$ near 0 or 1 indicate task separation, with one extreme representing non-outlier tasks and the opposite extreme representing outlier tasks; the specific direction (0 = non-outlier vs. 1 = outlier, or vice versa) may depend on the problem.
  • Figure 4: A visualization of the distribution of training data points, comprising eight non-outlier tasks (Tasks 1-8) and two outlier tasks (Tasks 9-10).
  • Figure 5: Comparison of shared function estimation results by R-MTGB and MTGB for a representative non-outlier task (left subplot) and a representative outlier task (right subplot).
  • ...and 4 more figures
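
The caption of Figure 3 notes that the direction of $\sigma(\boldsymbol{\theta})$ is not fixed, so reading outlier tasks off the learned values needs a convention. The sketch below shows one possible convention and is not taken from the paper: it assumes outlier tasks are the minority, splits tasks at a threshold of 0.5, and flags the smaller group. The function name `flag_minority_tasks`, the threshold, and the example values of `theta` are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def flag_minority_tasks(thetas, threshold=0.5):
    """Return indices of tasks on the minority side of the sigmoid split.

    Assumption: outlier tasks are fewer than non-outlier tasks, so the
    smaller group is flagged regardless of whether it sits near 0 or
    near 1. An even split is treated as ambiguous and yields no flags.
    """
    scores = [sigmoid(t) for t in thetas]
    high = [i for i, s in enumerate(scores) if s >= threshold]
    low = [i for i, s in enumerate(scores) if s < threshold]
    if len(high) == len(low):
        return []
    return high if len(high) < len(low) else low


if __name__ == "__main__":
    # Eight hypothetical tasks: seven on one side, one on the other.
    thetas = [-3.0] * 7 + [3.0]
    print(flag_minority_tasks(thetas))
```

Flipping the sign of every `theta` flags the same task, which matches the caption's point that only the separation matters and not which extreme is labelled as the outlier side.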