An Explainable Multi-Task Similarity Measure: Integrating Accumulated Local Effects and Weighted Fréchet Distance

Pablo Hidalgo, Daniel Rodriguez

TL;DR

This work introduces a post-hoc, model-agnostic multi-task similarity measure that combines Accumulated Local Effects (ALE) curves with a weighted Fréchet distance to quantify and explain task similarity. It weights ALE-based feature effects by feature importance and data support, aligns curves on a common grid, and optionally scales similarity by task performance via a factor $\gamma$. The paper gives a formal definition of the measure, along with practical limitations, recommendations for non-tabular data, and computational considerations. Empirical validation on synthetic data, Parkinson's disease data, bike-sharing data, and CelebA shows that the method yields interpretable, intuitive task relationships and supports clustering and exploratory analysis for multi-task learning.

Abstract

In many machine learning contexts, tasks are treated as interconnected components with the goal of leveraging knowledge transfer between them, which is the central aim of Multi-Task Learning (MTL). This multi-task scenario raises critical questions: which tasks are similar, and how and why do they exhibit similarity? In this work, we propose a multi-task similarity measure based on Explainable Artificial Intelligence (XAI) techniques, specifically Accumulated Local Effects (ALE) curves. ALE curves are compared using the Fréchet distance, weighted by the data distribution, and the resulting similarity measure incorporates the importance of each feature. The measure is applicable in both single-task learning scenarios, where each task is trained separately, and multi-task learning scenarios, where all tasks are learned simultaneously. The measure is model-agnostic, allowing the use of different machine learning models across tasks. A scaling factor is introduced to account for differences in predictive performance across tasks, and several recommendations are provided for applying the measure in complex scenarios. We validate this measure using four datasets: one synthetic and three real-world. The real-world datasets include a well-known Parkinson's dataset and a bike-sharing usage dataset, both in tabular format, as well as the CelebA dataset, which is used to evaluate the application of concept bottleneck encoders in a multi-task learning setting. The results demonstrate that the measure aligns with intuitive expectations of task similarity across both tabular and non-tabular data, making it a valuable tool for exploring relationships between tasks and supporting informed decision-making.
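
As a rough illustration of the curve-comparison step described in the abstract, the sketch below computes a discrete Fréchet distance between two ALE-like curves sampled on a common grid, scaling each pointwise distance by weights that stand in for data support. The function name `weighted_discrete_frechet`, the choice to average the two point weights, and the toy curves are illustrative assumptions; this is not the paper's formal definition of the measure.

```python
import math


def weighted_discrete_frechet(p, q, w_p, w_q):
    """Discrete Fréchet distance with a weighted ground distance.

    p, q     : lists of (x, y) points describing two curves.
    w_p, w_q : nonnegative per-point weights (assumed here to represent
               how much data supports each grid point).
    The weighting scheme (mean of the two point weights) is an assumption
    made for this sketch, not the definition used in the paper.
    """
    n, m = len(p), len(q)

    def cost(i, j):
        # Euclidean distance between the two points, scaled by their mean weight.
        return 0.5 * (w_p[i] + w_q[j]) * math.dist(p[i], q[j])

    # ca[i][j] holds the coupling distance between prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = cost(i, j)
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


# Two toy curves on the same grid that differ only at the last grid point.
curve_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
curve_b = [(0.0, 0.0), (1.0, 0.0), (2.0, 5.0)]

uniform = [1.0, 1.0, 1.0]
sparse_tail = [1.0, 1.0, 0.1]  # little data support at the last grid point

d_uniform = weighted_discrete_frechet(curve_a, curve_b, uniform, uniform)
d_sparse = weighted_discrete_frechet(curve_a, curve_b, sparse_tail, sparse_tail)
```

With uniform weights the disagreement at the last grid point dominates the distance. With a low weight there, the same disagreement contributes much less, which is the intuition behind weighting the comparison by the data distribution.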

Paper Structure (26 sections, 16 equations, 18 figures, 9 tables, 2 algorithms)

Figures (18)

  • Figure 1: ALE curves of different features. Intuitively, curves 1 and 3 are more similar than curves 1 and 2.
  • Figure 2: Density of each predictor variable for each task in Synthetic Dataset 1.
  • Figure 3: ALE curves for Synthetic Dataset 1. The first row shows the ALE curves of all tasks for the same feature. The remaining rows show the ALE curves for each task and feature.
  • Figure 4: Multi-task similarity values.
  • Figure 5: ALE curves for task 6.
  • ...and 13 more figures