Multi-task learning via robust regularized clustering with non-convex group penalties

Akira Okazaki, Shuichi Kawano

TL;DR

MTLRRC addresses robust, cluster-based multi-task learning by integrating robust regularized clustering with non-convex group penalties, so that tasks are clustered and outlier tasks are detected simultaneously. The method links robust clustering with a multivariate M-estimator, and its parameters are estimated with a modified ADMM algorithm; both convex and non-convex variants are offered. Simulation and real-data results show that non-convex penalties yield strong outlier detection performance and robust clustering, with practical interpretability of task structure. The approach improves robustness and insight in settings where some tasks are substantially heterogeneous or unrelated to others, enabling more reliable cross-task learning.

Abstract

Multi-task learning (MTL) aims to improve estimation and prediction performance by sharing common information among related tasks. One natural assumption in MTL is that tasks are classified into clusters based on their characteristics. However, existing MTL methods based on this assumption often ignore outlier tasks that have large task-specific components or no relation to other tasks. To address this issue, we propose a novel MTL method called Multi-Task Learning via Robust Regularized Clustering (MTLRRC). MTLRRC incorporates robust regularization terms inspired by robust convex clustering, which is further extended to handle non-convex and group-sparse penalties. The extension allows MTLRRC to simultaneously perform robust task clustering and outlier task detection. The connection between the extended robust clustering and the multivariate M-estimator is also established. This provides an interpretation of the robustness of MTLRRC against outlier tasks. An efficient algorithm based on a modified alternating direction method of multipliers is developed for the estimation of the parameters. The effectiveness of MTLRRC is demonstrated through simulation studies and application to real data.

Paper Structure

This paper contains 23 sections, 3 theorems, 68 equations, 5 figures, 3 tables, 5 algorithms.

Key Result

Proposition 1

Suppose that $\widehat{U}$ is a convergence point of Algorithm BCD_GRRC and let $\bm{\psi}(\bm{o};\lambda,\gamma)=\bm{o} - \bm{\Theta}(\bm{o};\lambda,\gamma)$. Then $\widehat{U}$ satisfies the equation where $\Psi(X-\widehat{U};\lambda_{2},\gamma)$ is an $np$-dimensional vector defined as

Figures (5)

  • Figure 1: The multivariate loss functions (top row) and the group-thresholding functions (bottom row). The x-axis and y-axis represent the values of the input $\bm{z}\in\mathbb{R}^{2}$. The z-axis shows the first component of the output $\Theta(\bm{z};\lambda,\gamma)$ in the top row, and $\rho_{\lambda,\gamma}(\bm{z})$ in the bottom row. The values of $\lambda$ and $\gamma$ are both fixed at three.
  • Figure 8: Mean of the estimated parameter values in MTLRRC (GS$\gamma$) over 100 repetitions for the landmine data
  • Figure 12: Mean of the estimated parameter values in MTLRRC (GS$\gamma$) over 100 repetitions for the school data
  • Figure 16: Proportion of the 100 repetitions with $\widehat{\bm{o}}_{m}\neq\bm{0}$ in the landmine data
  • Figure 17: Proportion of the 100 repetitions with $\widehat{\bm{o}}_{m}\neq\bm{0}$ in the school data
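
The group-thresholding function $\bm{\Theta}$ in Figure 1 and the map $\bm{\psi}(\bm{o};\lambda,\gamma)=\bm{o}-\bm{\Theta}(\bm{o};\lambda,\gamma)$ from Proposition 1 can be illustrated with a minimal Python sketch. It assumes the textbook group-lasso (soft) and group-MCP (firm) thresholding rules; the paper's exact penalties and parametrization may differ. The function names `group_soft_threshold`, `group_mcp_threshold`, and `psi` are hypothetical and not taken from the paper.

```python
import numpy as np

def group_soft_threshold(z, lam):
    """Group-lasso thresholding (assumed form): shrink the norm of z by lam."""
    z = np.asarray(z, dtype=float)
    norm = np.linalg.norm(z)
    if norm <= lam:
        return np.zeros_like(z)          # whole group is set to zero
    return (1.0 - lam / norm) * z        # direction kept, length reduced by lam

def group_mcp_threshold(z, lam, gamma):
    """Group-MCP (firm) thresholding (assumed form); requires gamma > 1."""
    z = np.asarray(z, dtype=float)
    if gamma <= 1.0:
        raise ValueError("gamma must be greater than 1 for this form")
    if np.linalg.norm(z) > gamma * lam:
        return z.copy()                  # large inputs pass through unshrunk
    return gamma / (gamma - 1.0) * group_soft_threshold(z, lam)

def psi(z, theta, **params):
    """psi(z) = z - Theta(z): the part of z NOT absorbed by thresholding."""
    z = np.asarray(z, dtype=float)
    return z - theta(z, **params)

# Example input with norm 5, thresholded with lam = 1 and gamma = 3.
z_large = np.array([3.0, 4.0])
psi_convex = psi(z_large, group_soft_threshold, lam=1.0)
psi_nonconvex = psi(z_large, group_mcp_threshold, lam=1.0, gamma=3.0)
```

Under these assumed forms, the convex rule gives a $\bm{\psi}$ whose norm is capped at $\lambda$ (a multivariate Huber-type influence), whereas the non-convex rule gives $\bm{\psi}=\bm{0}$ once $\|\bm{z}\|>\gamma\lambda$. This redescending behaviour is one way to read the robustness against outlier tasks that the abstract attributes to the M-estimator connection.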

Theorems & Definitions (3)

  • Proposition 1
  • Proposition 2
  • Lemma 1: Shimmura2022-ar; Theorem 1