Modeling Output-Level Task Relatedness in Multi-Task Learning with Feedback Mechanism

Xiangming Xi, Feng Gao, Jun Xu, Fangtai Guo, Tianlei Jin

TL;DR

The paper tackles the limitation of traditional MTL that relies on feature- or parameter-level relatedness by introducing output-level relatedness through a feedback mechanism that uses one task's outputs as posterior information for others. It models a dynamic, iterative MTL process augmented with amplifier blocks and a Gumbel-based gate to determine infusion points, along with a convergence loss to stabilize predictions across iterations. Empirical results on SLU benchmarks show improved performance across tasks, with ablations demonstrating the importance of both the convergence loss and the gating mechanism. The approach provides a practical pathway to harness cross-task output correlations, potentially enhancing MTL systems in domains requiring tightly coupled outputs.

Abstract

Multi-task learning (MTL) is a paradigm that simultaneously learns multiple tasks by sharing information at different levels, enhancing the performance of each individual task. While previous research has primarily focused on feature-level or parameter-level task relatedness, and proposed various model architectures and learning algorithms to improve learning performance, we aim to explore output-level task relatedness. This approach introduces a posteriori information into the model, considering that different tasks may produce correlated outputs with mutual influences. We achieve this by incorporating a feedback mechanism into MTL models, where the output of one task serves as a hidden feature for another task, thereby transforming a static MTL model into a dynamic one. To ensure the training process converges, we introduce a convergence loss that measures the trend of a task's outputs during each iteration. Additionally, we propose a Gumbel gating mechanism to determine the optimal projection of feedback signals. We validate the effectiveness of our method and evaluate its performance through experiments conducted on several baseline models in spoken language understanding.
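
The feedback loop and convergence loss described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's formulation: the two toy task heads (`task_head`), the additive way feedback enters the logits, the `weight` parameter, the requirement that both tasks share an output dimension, and the choice of mean squared change between consecutive iterations as the convergence measure.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def task_head(features, feedback, weight):
    """Hypothetical task head: the other task's previous output is added
    to this task's logits, scaled by `weight` (an assumed form of feedback)."""
    return softmax([f + weight * fb for f, fb in zip(features, feedback)])


def run_feedback(feat_a, feat_b, weight=0.5, num_iters=5):
    """Iterate two tasks, each consuming the other's previous output.
    Returns the list of (output_a, output_b) pairs, one per iteration,
    starting with the feedback-free (static) predictions."""
    out_a, out_b = softmax(feat_a), softmax(feat_b)
    history = [(out_a, out_b)]
    for _ in range(num_iters):
        # Both updates use the outputs from the previous iteration.
        out_a, out_b = (task_head(feat_a, out_b, weight),
                        task_head(feat_b, out_a, weight))
        history.append((out_a, out_b))
    return history


def convergence_loss(history):
    """Assumed convergence measure: mean squared change of each task's
    output between consecutive iterations (smaller means more stable)."""
    total, count = 0.0, 0
    for (prev_a, prev_b), (cur_a, cur_b) in zip(history, history[1:]):
        for prev, cur in ((prev_a, cur_a), (prev_b, cur_b)):
            total += sum((c - p) ** 2 for c, p in zip(cur, prev))
            count += len(cur)
    return total / count if count else 0.0


history = run_feedback([1.0, 0.0, -1.0], [0.0, 2.0, 0.0])
print(convergence_loss(history))
```

With `weight=0.0` the feedback path is switched off, the outputs never change, and the loss in this sketch is zero; that corresponds to the static MTL model the abstract contrasts against.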

Paper Structure (10 sections, 1 equation, 5 figures, 4 tables)

Figures (5)

  • Figure 1: Illustration of the proposed feedback and Gumbel gating mechanisms.
  • Figure 2: Illustration of the SG model (a) and its counterpart with the proposed feedback mechanism (b). Blue blocks represent feedback amplifiers, green blocks represent Gumbel gating blocks, red circles indicate candidate infusion positions, and dashed lines indicate potential feedback routes.
  • Figure 3: Illustration of the CIT model (a) and its counterpart with the proposed feedback mechanism (b).
  • Figure 4: Illustration of the SLU-LM model (Liu2016h) (a) and its counterpart with the proposed feedback mechanism (b). The blue, green, and orange dashed lines are the feedback routes originating from Task ID, SF, and NWP, respectively.
  • Figure 5: Performance comparison of CIT (ITER.) and CIT (ITER. + CONV.). For better visualization, we scale the steps between 0 and 3000 by 10x, with setting steps 828/1732 for Intent/ACC and 1995/3311 for Slot/F1, respectively.
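
The Gumbel gating blocks in Figures 1 and 2 choose among candidate infusion positions for a feedback signal. A minimal sketch of how such a gate could sample a position with the standard Gumbel-softmax trick follows. The function names, the `temperature` parameter, and the hard arg-max selection are assumptions for illustration; the page does not give the paper's exact gate.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one sample from the standard Gumbel(0, 1) distribution."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, temperature, rng):
    """Relaxed categorical sample: softmax of (logits + Gumbel noise) / T."""
    noisy = [(l + sample_gumbel(rng)) / temperature for l in logits]
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def select_infusion_position(logits, temperature=1.0, rng=None):
    """Pick one candidate position (hypothetical gate).

    `logits` holds one learnable score per candidate infusion position.
    Returns the chosen index, its one-hot encoding, and the soft
    probabilities that a trainable model would backpropagate through."""
    rng = rng or random.Random()
    probs = gumbel_softmax(logits, temperature, rng)
    index = max(range(len(probs)), key=probs.__getitem__)
    one_hot = [1.0 if i == index else 0.0 for i in range(len(probs))]
    return index, one_hot, probs


# Three candidate positions; the second has the highest score.
index, one_hot, probs = select_infusion_position(
    [0.1, 2.0, -1.0], temperature=0.5, rng=random.Random(0)
)
print(index, one_hot)
```

Lower temperatures push the soft probabilities closer to the one-hot choice, while higher temperatures keep the gate more exploratory during training.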