NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow Networks

Fangzhou Lin, Yuping Wang, Yuliang Guo, Zixun Huang, Xinyu Huang, Haichong Zhang, Kazunori Yamada, Zhengzhong Tu, Liu Ren, Ziming Zhang

TL;DR

This work tackles Partially Supervised Multi-Task Learning when tasks are structurally different and supervision is domain-partitioned. It introduces NexusFlow, a plug-and-play framework that uses per-task surrogate modules with invertible coupling layers to map task features into a shared latent space for distribution alignment, preserving information and avoiding collapse. The approach is theoretically justified via a Lipschitz-based bound and empirically validated on nuScenes (domain-partitioned autonomous driving) and NYUv2 (multi-task indoor perception), showing improved cross-task transfer and state-of-the-art performance under PS-MTL. The results demonstrate broad applicability, improved alignment of heterogeneous tasks, and efficiency advantages over prior PS-MTL methods designed for homogeneous tasks.

Abstract

Partially Supervised Multi-Task Learning (PS-MTL) aims to leverage knowledge across tasks when annotations are incomplete. Existing approaches, however, have largely focused on the simpler setting of homogeneous, dense prediction tasks, leaving the more realistic challenge of learning from structurally diverse tasks unexplored. To this end, we introduce NexusFlow, a novel, lightweight, and plug-and-play framework effective in both settings. NexusFlow introduces a set of surrogate networks with invertible coupling layers to align the latent feature distributions of tasks, creating a unified representation that enables effective knowledge transfer. The coupling layers are bijective, preserving information while mapping features into a shared canonical space. This invertibility avoids representational collapse and enables alignment across structurally different tasks without reducing expressive capacity. We first evaluate NexusFlow on the core challenge of domain-partitioned autonomous driving, where dense map reconstruction and sparse multi-object tracking are supervised in different geographic regions, creating both structural disparity and a strong domain gap. NexusFlow sets a new state-of-the-art result on nuScenes, outperforming strong partially supervised baselines. To demonstrate generality, we further test NexusFlow on NYUv2 using three homogeneous dense prediction tasks (segmentation, depth, and surface normals) as a representative N-task PS-MTL scenario. NexusFlow yields consistent gains across all tasks, confirming its broad applicability.
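To make the bijectivity claim concrete, here is a minimal sketch of an invertible coupling layer. The abstract does not say which coupling design NexusFlow uses, so this assumes a RealNVP-style affine coupling; the class name `AffineCoupling`, the tiny random conditioner network, and the feature sizes are all hypothetical and chosen only for illustration.

```python
import numpy as np


class AffineCoupling:
    """RealNVP-style affine coupling layer (illustrative; not the paper's exact layer)."""

    def __init__(self, dim, hidden=16, seed=0):
        assert dim % 2 == 0, "this sketch splits the feature vector in half"
        rng = np.random.default_rng(seed)
        half = dim // 2
        # Tiny conditioner: maps the first half to a log-scale and a shift
        # that are applied to the second half.
        self.w1 = rng.normal(scale=0.1, size=(half, hidden))
        self.w_s = rng.normal(scale=0.1, size=(hidden, half))
        self.w_t = rng.normal(scale=0.1, size=(hidden, half))

    def _scale_shift(self, a):
        h = np.tanh(a @ self.w1)
        return np.tanh(h @ self.w_s), h @ self.w_t  # bounded log-scale, shift

    def forward(self, x):
        a, b = np.split(x, 2, axis=-1)
        log_s, t = self._scale_shift(a)
        # The first half passes through unchanged, so the inverse can
        # recompute exactly the same scale and shift.
        return np.concatenate([a, b * np.exp(log_s) + t], axis=-1)

    def inverse(self, z):
        a, y = np.split(z, 2, axis=-1)
        log_s, t = self._scale_shift(a)
        return np.concatenate([a, (y - t) * np.exp(-log_s)], axis=-1)


layer = AffineCoupling(dim=8)
h = np.random.default_rng(1).normal(size=(4, 8))  # stand-in for compact task features
z = layer.forward(h)      # features mapped into the latent space
h_rec = layer.inverse(z)  # recovered features
print(np.allclose(h, h_rec))
```

Because the inverse recovers the input up to floating-point error, no information is discarded by the mapping, which is the property the abstract relies on to avoid representational collapse.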

Paper Structure

This paper contains 18 sections, 1 theorem, 9 equations, 7 figures, 8 tables.

Key Result

Lemma 1

Let $h'_{1}, h'_{2} \in \mathbb{R}^N$ be the compact features of two tasks $t_1$ and $t_2$ passed into their coupling layers $c_{1}$ and $c_{2}$. Assume their inverse transformations $c_{1}^{-1}$ and $c_{2}^{-1}$ are Lipschitz continuous with constant $L$ (Virmaux and Scaman, 2018; Gouk et al., 2021). [...] where $\delta$ denotes the maximum structural discrepancy between the two inverse transformations o[...]
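
For reference, the Lipschitz assumption in Lemma 1 can be written out explicitly. This is only the standard definition applied to the two inverse maps; the Euclidean norm and the symbols $z, z'$ are assumptions introduced here, and the lemma's own bound is not reproduced.

```latex
\[
\left\| c_i^{-1}(z) - c_i^{-1}(z') \right\|_2 \le L \, \left\| z - z' \right\|_2
\qquad \text{for all } z, z' \in \mathbb{R}^N,\; i \in \{1, 2\}.
\]
```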

Figures (7)

  • Figure 1: Problem Setup & Motivation. We illustrate the setup of Partially Supervised Multi-Task Learning using autonomous driving as a representative example. (a) In the ideal case, all training data are fully annotated for all tasks (e.g., tracking and mapping), and mixed training achieves the upper bound of Multi-Task Learning (MTL) performance. (b) In practice, however, collaborative communities often provide large amounts of valuable but single-task-oriented datasets. These datasets usually differ in both task labels and domains (e.g., geographic or scene-level domain gaps). Naively mixing them leads to incomplete supervision, causing degraded performance and poor cross-domain generalization compared with fully supervised MTL. (c) Our goal is to bridge both task and data domain gaps using a simple yet effective migration strategy in the latent space.
  • Figure 2: NexusFlow Pipeline. A simple yet general framework that scales to the $N$-task partially supervised MTL setting. Given a data batch where only Task 1 has annotations, we extract latent features $h_i$ from all $N$ task heads. These features are then encoded into a shared representation space $\{z_i\}$ via an invertible network, where the model minimizes the distance between each $z_i$ and their mean to promote cross-task consistency.
  • Figure 3: From left to right: t-SNE visualizations from the coupling layers of Baseline, MTPSL, JTR, and NexusFlow (Ours).
  • Figure 4: From left to right: t-SNE visualizations from the coupling layers of Baseline, MTPSL, and NexusFlow (Ours).
  • Figure 5: Decay of eigenvalue magnitudes. Slower decay indicates a more complex and information-rich feature space.
  • ...and 2 more figures
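
The alignment objective described in the Figure 2 caption (pulling each $z_i$ toward the mean of all task latents) can be sketched as follows. The caption does not specify the distance, so squared Euclidean distance is assumed here; the function name `alignment_loss` and the toy shapes are hypothetical.

```python
import numpy as np


def alignment_loss(latents):
    """Average squared distance between each task latent z_i and the cross-task mean.

    `latents` is a list of N arrays, each of shape (batch, dim).
    The squared Euclidean distance is an assumption of this sketch.
    """
    z = np.stack(latents, axis=0)           # (N_tasks, batch, dim)
    z_mean = z.mean(axis=0, keepdims=True)  # (1, batch, dim)
    return float(np.mean(np.sum((z - z_mean) ** 2, axis=-1)))


rng = np.random.default_rng(0)
z_tasks = [rng.normal(size=(4, 8)) for _ in range(3)]  # three toy task latents

loss_spread = alignment_loss(z_tasks)            # unrelated latents: positive loss
loss_aligned = alignment_loss([z_tasks[0]] * 3)  # identical latents: loss near zero
print(loss_spread, loss_aligned)
```

The loss vanishes only when all task latents coincide, so minimizing it encourages the cross-task consistency the caption describes.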

Theorems & Definitions (2)

  • Lemma 1: Bounded Feature Discrepancy
  • Proof