Ensemble Prediction of Task Affinity for Efficient Multi-Task Learning

Afiya Ayman; Ayan Mukhopadhyay; Aron Laszka

TL;DR

ETAP (Ensemble Task Affinity Predictor) is a scalable framework that integrates principled and data-driven estimators to predict MTL performance gains. On benchmark datasets, ETAP improves MTL gain prediction and enables more effective task grouping.

Abstract

A fundamental problem in multi-task learning (MTL) is identifying groups of tasks that should be learned together. Since training MTL models for all possible combinations of tasks is prohibitively expensive for large task sets, a crucial component of efficient and effective task grouping is predicting whether a group of tasks would benefit from learning together, measured as per-task performance gain over single-task learning. In this paper, we propose ETAP (Ensemble Task Affinity Predictor), a scalable framework that integrates principled and data-driven estimators to predict MTL performance gains. First, we consider the gradient-based updates of shared parameters in an MTL model to measure the affinity between a pair of tasks as the similarity between the parameter updates based on these tasks. This linear estimator, which we call affinity score, naturally extends to estimating affinity within a group of tasks. Second, to refine these estimates, we train predictors that apply non-linear transformations and correct residual errors, capturing complex and non-linear task relationships. We train these predictors on a limited number of task groups for which we obtain ground-truth gain values via multi-task learning for each group. We demonstrate on benchmark datasets that ETAP improves MTL gain prediction and enables more effective task grouping, outperforming state-of-the-art baselines across diverse application domains.
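
To make the affinity-score idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each task's shared-parameter update is available as a flattened vector, that "similarity" is cosine similarity, and that group affinity is the mean over all task pairs in the group; the abstract does not specify these choices, and the function names and toy vectors are hypothetical.

```python
import numpy as np


def cosine_similarity(u, v, eps=1e-12):
    """Cosine similarity between two flattened update vectors (assumed metric)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps))


def pairwise_affinity(updates):
    """Map each unordered task pair to the similarity of their updates.

    updates: dict of task name -> 1-D array holding the update that this
    task's gradient would apply to the shared parameters.
    """
    tasks = sorted(updates)
    return {
        (a, b): cosine_similarity(updates[a], updates[b])
        for i, a in enumerate(tasks)
        for b in tasks[i + 1:]
    }


def group_affinity(pair_scores, group):
    """Mean pairwise affinity over a group (an assumed aggregation rule)."""
    members = sorted(group)
    if len(members) < 2:
        raise ValueError("a group needs at least two tasks")
    pairs = [(a, b) for i, a in enumerate(members) for b in members[i + 1:]]
    return sum(pair_scores[p] for p in pairs) / len(pairs)


# Toy update vectors, made up purely for illustration.
toy_updates = {
    "task_a": np.array([0.5, -0.1, 0.3]),
    "task_b": np.array([0.4, -0.2, 0.1]),
    "task_c": np.array([-0.3, 0.2, -0.4]),
}
toy_scores = pairwise_affinity(toy_updates)
toy_group_score = group_affinity(toy_scores, {"task_a", "task_b", "task_c"})
```

Because the group score here is a plain average of pairwise scores, it is linear in those scores, which matches the abstract's description of the affinity score as a linear estimator that extends from pairs to groups.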

Paper Structure (45 sections, 12 equations, 7 figures, 12 tables)

Introduction
Contributions:
Problem Definition
ETAP: Ensemble Task-Affinity Prediction
Task-Affinity Score via White-Box Analysis
Task Loss and Gradients
Shared Parameter Updates
Affinity Score
Data-Driven Ensemble Prediction
Affinity Scores to MTL Gains with Non-Linear Mapping
Data-driven Residual Prediction for Improved Accuracy
Group Selection based on MTL Gain Predictions
Experiments and Results
Experimental Setup
Dataset and MTL Architecture
...and 30 more sections

Figures (7)

Figure 1: Visualizing ETAP: From White-box Task Affinity Scores to Data-driven Ensembled MTL Gain Predictions. Task affinity computed from a baseline MTL model and ground-truth MTL gains are fed into an ensemble framework. Non-linear transformations yield initial predictions, which are later refined by residual correction through regularized regression.
Figure 2: Prediction performance ($R^2$) vs. computational cost ($|\mathcal{G}_{\text{train}}|$) for data-driven predictors, MTGNet and ETAP. MTGNet suffers from instability, with means outside the IQR due to outliers, whereas ETAP maintains consistency.
Figure 3: Comparison of prediction performance ($R^2$) between the initial predictions (without residual adjustment) and the final predictions (with residual adjustment) across four datasets. The residual adjustment step improves accuracy, particularly on CelebA, Chemical, and Ridership, with a slight improvement observed on ETTm1.
Figure 4: Runtime for the CelebA and ETTm1 datasets (measured in hours) and the Chemical dataset (measured in minutes). Running times include the task-affinity score calculations and complete multi-task learning training for all tasks in the benchmark together, using baseline MTL models.
Figure 5: Comparison of prediction performance ($R^2$) for different approaches applied in the first step of the ensemble model ($f_{\text{non-linear}}$) on the CelebA, ETTm1, Chemical, and Ridership datasets. These performances are reported after applying a residual correction step ($f_{\text{residual}}$), where regularized ridge regression is used for residual prediction.
...and 2 more figures
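
The two-step ensemble named in the captions above ($f_{\text{non-linear}}$ followed by $f_{\text{residual}}$) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the non-linear step is assumed to be a low-degree polynomial fit, the residual step uses closed-form ridge regression without an intercept, and the feature matrix, function names, and synthetic data are all hypothetical.

```python
import numpy as np


def fit_nonlinear(scores, gains, degree=2):
    """Step 1: map affinity scores to gains (polynomial fit is an assumption)."""
    return np.polyfit(scores, gains, degree)


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X^T X + alpha * I)^{-1} X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def fit_ensemble(scores, features, gains, degree=2, alpha=1.0):
    """Fit step 1, then fit ridge regression on what step 1 leaves unexplained."""
    coeffs = fit_nonlinear(scores, gains, degree)
    residuals = gains - np.polyval(coeffs, scores)
    weights = fit_ridge(features, residuals, alpha)
    return coeffs, weights


def predict_ensemble(coeffs, weights, scores, features):
    """Final prediction = initial non-linear prediction + predicted residual."""
    return np.polyval(coeffs, scores) + features @ weights


# Synthetic training groups, generated only to exercise the code.
rng = np.random.default_rng(0)
scores = rng.uniform(-1.0, 1.0, size=40)                      # affinity score per group
features = rng.integers(0, 2, size=(40, 5)).astype(float)     # hypothetical group descriptors
gains = (
    0.5 * scores**2
    + 0.3 * scores
    + features @ np.array([0.1, -0.2, 0.0, 0.05, 0.0])
    + rng.normal(0.0, 0.01, size=40)
)

coeffs, weights = fit_ensemble(scores, features, gains)
initial_pred = np.polyval(coeffs, scores)
final_pred = predict_ensemble(coeffs, weights, scores, features)
initial_mse = float(np.mean((gains - initial_pred) ** 2))
final_mse = float(np.mean((gains - final_pred) ** 2))
```

On the training groups, the residual step cannot increase the squared error, since setting all ridge weights to zero recovers the initial prediction; whether it helps on held-out groups depends on the data.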
