T3: A Novel Zero-shot Transfer Learning Framework Iteratively Training on an Assistant Task for a Target Task

Xindi Tong, Yujin Zhu, Shijian Fan, Liang Xu

TL;DR

Long-text summarization faces data scarcity and the need to preserve extensive contextual details. The paper proposes T3, a zero-shot transfer framework that first trains a baseline LLM on an assistant task, question answering (QA), and iteratively distills useful experience to transfer to the target summarization task, with QA and question generation (QG) guiding the process. Across BBC Summary, NarraSum, NLQuAD, and FairytaleQA, T3 yields consistent improvements in ROUGE, BLEU, and Factscore over seven baselines, illustrating effective cross-task knowledge transfer. The approach is model-agnostic and extensible to other assistant-target task combinations, offering a practical pathway to enhance long-document processing with limited task-specific data.

Abstract

Long text summarization, increasingly essential for efficiently processing large volumes of information, remains challenging for Large Language Models (LLMs) such as the GPT and LLaMA families because open-source training datasets are scarce and the task demands careful handling of contextual details. To address this issue, we design a novel zero-shot transfer learning framework, abbreviated as T3, that iteratively trains a baseline LLM on an assistant task for the target task, where the former should have richer data resources and share structural or semantic similarity with the latter. In practice, we apply T3 to long text summarization with question answering as the assistant task, and validate its effectiveness on the BBC Summary, NarraSum, FairytaleQA, and NLQuAD datasets, where it achieves up to nearly 14% improvement in ROUGE, 35% in BLEU, and 16% in Factscore compared to three baseline LLMs, demonstrating its potential for more assistant-target task combinations.

Paper Structure (37 sections, 3 equations, 3 figures, 2 tables, 1 algorithm)

Figures (3)

  • Figure 1: Training and test process of T3 for summarization task.
  • Figure 2: Test process of T3 for summarization task on QA dataset.
  • Figure 3: General workflow of task-agnostic T3.