When Multi-Task Learning Meets Partial Supervision: A Computer Vision Review

Maxime Fontana, Michael Spratling, Miaojing Shi

TL;DR

This review surveys how computer vision can exploit multi-task learning under partial supervision, detailing parameter-sharing, fusion, decomposition, and NAS strategies that balance tasks and reduce data labeling needs. It clarifies optimization challenges (loss weighting, gradient conflicts, and Pareto-front approaches) and surveys task-grouping and partially supervised techniques (self-supervised, semi-supervised, few-shot) that improve data efficiency. The authors synthesize datasets, tools, and benchmarking results to provide practical guidance for building scalable, data-efficient MTL CV systems. Overall, partially supervised MTL can match or exceed fully supervised performance in many settings, which underlines the importance of task relationships, adaptive balancing, and comprehensive benchmarking. The work highlights future opportunities in large-scale task pools, adaptive sharing mechanisms, and cross-task learning in diverse CV domains.

Abstract

Multi-Task Learning (MTL) aims to learn multiple tasks simultaneously while exploiting their mutual relationships. By using shared resources to calculate multiple outputs simultaneously, this learning paradigm has the potential to lower memory requirements and inference times compared to the traditional approach of using separate methods for each task. Previous work in MTL has mainly focused on fully-supervised methods, as task relationships can be leveraged not only to lower the data dependency of those methods but also to improve performance. However, MTL introduces a set of challenges due to a complex optimisation scheme and a higher labeling requirement. This review focuses on how MTL could be utilised under different partial supervision settings to address these challenges. First, this review analyses how MTL traditionally uses different parameter sharing techniques to transfer knowledge between tasks. Second, it presents the different challenges arising from such a multi-objective optimisation scheme. Third, it introduces how task groupings can be achieved by analysing task relationships. Fourth, it focuses on how partially supervised methods applied to MTL can tackle the aforementioned challenges. Lastly, this review presents the available datasets, tools and benchmarking results of such methods.

Paper Structure (36 sections, 43 equations, 15 figures, 6 tables, 1 algorithm)

Figures (15)

  • Figure 1: Overview of the literature review structure. First, we introduce Multi-Task Parameter Sharing in \ref{chapter:MT-parameter-sharing}. Second, we review Optimisation Challenges in \ref{chapter:Optimisation}. Third, we review how task relationships can be used to group tasks in \ref{sec:task-grouping}. Finally, we introduce, in \ref{chapter:partial-supervision}, the different partially-supervised computer vision methods in MTL.
  • Figure 2: Overview of the different reviews on Multi-Task Learning.
  • Figure 3: Multi-Task Learning architectures have mainly been divided into two design schemes. Hard-parameter sharing (top) splits a shared backbone into task-specific heads, which receive input from the same set of features. Soft-parameter sharing (bottom) uses task-specific networks, but allows information to be shared between them.
  • Figure 4: Two task-specific CNN models CNN$_{i}$ and CNN$_{j}$. The NDDR layer first concatenates the representations of the respective convolutional blocks. $1 \times 1$ convolutions are then applied to this concatenation, one per task. Finally, after batch normalisation, the features are propagated to the next convolutional block of each model.
  • Figure 5: Visualisation of the different gradient update methods in MTL. The blue arrows represent the projections of the task-specific gradient updates, denoted $g_{1}$, $g_{2}$ and $g_{3}$. The red arrow represents the aggregated gradient update.
  • ...and 10 more figures
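
The three steps named in the Figure 4 caption (concatenate, one $1 \times 1$ convolution per task, batch normalisation) can be illustrated with a minimal pure-Python sketch. This is an illustration under stated assumptions, not the implementation from the paper: feature maps are flattened to a list of pixels, each pixel a list of channel values; batch normalisation has no learned scale or shift; the function names (`conv1x1`, `batch_norm`, `nddr_layer`) and the weight values are hypothetical and chosen only for the example.

```python
import math


def conv1x1(pixels, weight):
    """1x1 convolution: apply the same channel-mixing matrix at every pixel.

    pixels: list of pixels, each a list of C_in channel values.
    weight: C_out rows, each a list of C_in coefficients.
    """
    return [[sum(w * x for w, x in zip(row, px)) for row in weight]
            for px in pixels]


def batch_norm(pixels, eps=1e-5):
    """Normalise each channel to zero mean and unit variance over all pixels.

    Assumption: no learned scale/shift parameters, to keep the sketch short.
    """
    n = len(pixels)
    channels = range(len(pixels[0]))
    means = [sum(px[c] for px in pixels) / n for c in channels]
    variances = [sum((px[c] - means[c]) ** 2 for px in pixels) / n
                 for c in channels]
    return [[(px[c] - means[c]) / math.sqrt(variances[c] + eps)
             for c in channels] for px in pixels]


def nddr_layer(feats_i, feats_j, weight_i, weight_j):
    """Fuse the features of two task-specific networks (hypothetical helper)."""
    # Step 1: concatenate the two representations along the channel axis.
    fused = [pi + pj for pi, pj in zip(feats_i, feats_j)]
    # Step 2: one 1x1 convolution per task maps 2C channels back to C.
    out_i = conv1x1(fused, weight_i)
    out_j = conv1x1(fused, weight_j)
    # Step 3: batch-normalise before passing on to each model's next block.
    return batch_norm(out_i), batch_norm(out_j)


# Toy input: 3 pixels with 2 channels per task.
feats_i = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
feats_j = [[0.5, 0.0], [1.5, 1.0], [2.5, 2.0]]

# Illustrative weights: task i keeps only its own features,
# task j mostly keeps its own but mixes in 10% of task i's.
weight_i = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0]]
weight_j = [[0.1, 0.0, 0.9, 0.0],
            [0.0, 0.1, 0.0, 0.9]]

out_i, out_j = nddr_layer(feats_i, feats_j, weight_i, weight_j)
print(out_i)
print(out_j)
```

Because each task owns its own $1 \times 1$ weights, the amount of cross-task sharing is set per task and per layer. With the weights above, task $i$ ignores task $j$ entirely, while task $j$ borrows a small share of task $i$'s features.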