Continuous Risk Prediction

Yi Dai

TL;DR

Diana is a dynamic architecture-based lifelong QA model that uses four hierarchically structured types of prompts, covering both task-level and instance-level knowledge, to learn a sequence of QA tasks without requiring task identities at test time. It introduces prompt key vectors and a triplet-based learning objective to enable explicit handling of unseen tasks and cross-task knowledge sharing. The method is trained in two stages and employs scheduled sampling to balance supervision. Empirical results on 11 QA benchmarks show state-of-the-art lifelong QA performance, especially in unseen-task generalization, demonstrating practical applicability for continual QA systems.

Abstract

Lifelong learning (LL) capabilities are essential for QA models to excel in real-world applications, and architecture-based LL approaches have proven to be a promising direction for achieving this goal. However, adapting existing methods to QA tasks is far from straightforward. Many prior approaches either rely on access to task identities during testing or fail to adequately model samples from unseen tasks, which limits their practical applicability. To overcome these limitations, we introduce Diana, a novel \underline{d}ynam\underline{i}c \underline{a}rchitecture-based lifelo\underline{n}g Q\underline{A} framework designed to learn a sequence of QA tasks using a prompt-enhanced language model. Diana leverages four hierarchically structured types of prompts to capture QA knowledge at multiple levels of granularity. Task-level prompts are specifically designed to encode task-specific knowledge, ensuring strong lifelong learning performance. Meanwhile, instance-level prompts are utilized to capture shared knowledge across diverse input samples, enhancing the model's generalization capabilities. Additionally, Diana incorporates dedicated prompts to explicitly handle unseen tasks and introduces a set of prompt key vectors that facilitate efficient knowledge transfer and sharing between tasks. Through extensive experimentation, we demonstrate that Diana achieves state-of-the-art performance among lifelong QA models, with particularly notable improvements in its ability to handle previously unseen tasks. This makes Diana a significant advancement in the field of lifelong learning for question-answering systems.
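
As a rough illustration of how prompt key vectors and a triplet-style objective could work together, the sketch below selects a task prompt by nearest key and falls back to a dedicated "unseen" prompt when no key is close enough. It is not the paper's implementation: the function names, the Euclidean distance, the margin, and the `unseen_threshold` rule are all assumptions made for this example, and the summary above does not specify how Diana computes queries or keys.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(query, pos_key, neg_key, margin=1.0):
    """Standard triplet margin loss: pull the query toward the key of its
    own task (positive) and push it away from another task's key (negative)."""
    return max(0.0, euclidean(query, pos_key) - euclidean(query, neg_key) + margin)

def select_prompt(query, keys, unseen_threshold):
    """Pick the prompt whose key is nearest to the query, with no task
    identity given. If even the nearest key is farther than the
    (hypothetical) threshold, return the dedicated 'unseen' prompt."""
    best = min(keys, key=lambda name: euclidean(query, keys[name]))
    if euclidean(query, keys[best]) > unseen_threshold:
        return "unseen"
    return best

# Hypothetical two-task setup with 2-dimensional keys.
keys = {"task_a": [1.0, 0.0], "task_b": [0.0, 1.0]}
print(select_prompt([0.9, 0.1], keys, unseen_threshold=0.5))
print(select_prompt([5.0, 5.0], keys, unseen_threshold=0.5))
```

In this toy setup a query near a stored key retrieves that task's prompt, while a query far from every key is routed to the unseen-task prompt, which mirrors the test-time behavior the abstract describes only at a high level.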

Paper Structure

This paper contains 14 sections, 3 equations, 1 table.