Multi-task learning via robust regularized clustering with non-convex group penalties

Akira Okazaki; Shuichi Kawano

TL;DR

MTLRRC (Multi-Task Learning via Robust Regularized Clustering) is a cluster-based multi-task learning method that combines robust regularized clustering with non-convex group penalties, so that tasks are clustered and outlier tasks are detected simultaneously. The method is linked to a multivariate M-estimator, which gives an interpretation of its robustness, and its parameters are estimated with a modified ADMM algorithm; both convex and non-convex variants are available. Simulation and real-data results show that the non-convex penalties yield strong outlier detection performance and robust clustering, and that the estimated task structure is interpretable in practice. The approach is intended for settings where some tasks are substantially heterogeneous or unrelated to the others, where it makes cross-task learning more reliable.

Abstract

Multi-task learning (MTL) aims to improve estimation and prediction performance by sharing common information among related tasks. One natural assumption in MTL is that tasks are classified into clusters based on their characteristics. However, existing MTL methods based on this assumption often ignore outlier tasks that have large task-specific components or no relation to other tasks. To address this issue, we propose a novel MTL method called Multi-Task Learning via Robust Regularized Clustering (MTLRRC). MTLRRC incorporates robust regularization terms inspired by robust convex clustering, which is further extended to handle non-convex and group-sparse penalties. The extension allows MTLRRC to simultaneously perform robust task clustering and outlier task detection. The connection between the extended robust clustering and the multivariate M-estimator is also established. This provides an interpretation of the robustness of MTLRRC against outlier tasks. An efficient algorithm based on a modified alternating direction method of multipliers is developed for the estimation of the parameters. The effectiveness of MTLRRC is demonstrated through simulation studies and application to real data.

Paper Structure (23 sections, 3 theorems, 68 equations, 5 figures, 3 tables, 5 algorithms)

Introduction
Motivation and methodology
Problem set-up
Multi-task learning via convex clustering
Robust convex clustering
Non-convex extensions of robust convex clustering
Proposed method
Multi-task learning via robust regularized clustering
Interpretation through the BCD algorithm
Convex case
Non-convex case
Estimation algorithm via modified ADMM
Simulation studies
Application to real datasets
Conclusion
...and 8 more sections

Key Result

Proposition 1

Suppose that $\widehat{U}$ is a convergence point of Algorithm BCD_GRRC and that $\bm{\psi}(\bm{o};\lambda,\gamma)=\bm{o} - \bm{\Theta}(\bm{o};\lambda,\gamma)$. Then $\widehat{U}$ satisfies the equation where $\Psi(X-\widehat{U};\lambda_{2},\gamma)$ is an $np$-dimensional vector defined as

Figures (5)

Figure 1: The multivariate loss functions (top row) and the group-thresholding functions (bottom row). The x-axis and y-axis represent the two components of the input $\bm{z}\in\mathbb{R}^{2}$. The z-axis shows the first component of the output $\Theta(\bm{z};\lambda,\gamma)$ in the top row and $\rho_{\lambda,\gamma}(\bm{z})$ in the bottom row. The values of $\lambda$ and $\gamma$ are both fixed at three.
Figure 8: The mean of the estimated value of parameters in MTLRRC (GS$\gamma$) in 100 repetitions for the landmine data
Figure 12: The mean of the estimated value of parameters in MTLRRC (GS$\gamma$) in 100 repetitions for the school data
Figure 16: Ratio of $\widehat{\bm{o} }_{m}\neq\bm{0}$ for 100 repetitions in the landmine data
Figure 17: Ratio of $\widehat{\bm{o} }_{m}\neq\bm{0}$ for 100 repetitions in the school data
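
The group-thresholding function $\Theta(\bm{z};\lambda,\gamma)$ shown in Figure 1 and the relation $\bm{\psi}(\bm{o};\lambda,\gamma)=\bm{o}-\bm{\Theta}(\bm{o};\lambda,\gamma)$ from Proposition 1 can be illustrated with a small numerical sketch. The two thresholding rules below (group soft-thresholding for the convex case and a group firm-thresholding rule of MCP type for the non-convex case) are standard textbook forms chosen here as assumptions; the penalties actually used in the paper may differ. The function names are hypothetical, and $\lambda=\gamma=3$ follows the Figure 1 caption.

```python
import numpy as np

def group_soft_threshold(z, lam):
    """Group soft-thresholding: shrinks the whole vector z toward 0 by lam in norm.
    (Assumed convex example; not necessarily the paper's choice.)"""
    z = np.asarray(z, dtype=float)
    norm = np.linalg.norm(z)
    if norm <= lam:
        return np.zeros_like(z)
    return (1.0 - lam / norm) * z

def group_mcp_threshold(z, lam, gamma):
    """Group firm-thresholding of MCP type, valid for gamma > 1.
    (Assumed non-convex example; not necessarily the paper's choice.)"""
    z = np.asarray(z, dtype=float)
    norm = np.linalg.norm(z)
    if norm <= lam:
        return np.zeros_like(z)
    if norm <= gamma * lam:
        return (gamma / (gamma - 1.0)) * (1.0 - lam / norm) * z
    return z.copy()  # large inputs pass through unchanged

def psi(z, threshold, *args):
    """psi(z) = z - Theta(z), the form stated in Proposition 1."""
    z = np.asarray(z, dtype=float)
    return z - threshold(z, *args)

lam, gamma = 3.0, 3.0
small = np.array([1.0, 2.0])    # norm below lam
large = np.array([30.0, 40.0])  # norm 50, above gamma * lam = 9

psi_soft_small = psi(small, group_soft_threshold, lam)
psi_soft_large = psi(large, group_soft_threshold, lam)
psi_mcp_large = psi(large, group_mcp_threshold, lam, gamma)

print("soft, small input:", psi_soft_small)
print("soft, large input:", psi_soft_large, "norm:", np.linalg.norm(psi_soft_large))
print("MCP,  large input:", psi_mcp_large)
```

Under these assumed rules, $\bm{\psi}$ for the convex rule is the identity for small inputs and has norm capped at $\lambda$ for large ones (a Huber-type bounded influence), while $\bm{\psi}$ for the non-convex rule returns to zero for large inputs (a redescending influence). This is one way to read the link between the thresholding functions and the multivariate M-estimator mentioned in the abstract.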

Theorems & Definitions (3)

Proposition 1
Proposition 2
Lemma 1 (Shimmura2022-ar, Theorem 1)
