Class-Incremental Few-Shot Event Detection

Kailin Zhao, Xiaolong Jin, Long Bai, Jiafeng Guo, Xueqi Cheng

TL;DR

This work introduces the Class-Incremental Few-Shot Event Detection (CIFSED) task: continually learning new event classes from very few labeled examples while preserving performance on existing classes. It proposes Prompt-KD, a two-part method that combines an attention-based multi-teacher knowledge distillation framework with prompt learning to counter both forgetting of base knowledge and overfitting on new classes. An ancestor teacher and a father teacher guide a student model through successive learning sessions, with exemplar replay and an attention mechanism that balances the two teachers' influence. A cloze-prompting strategy with a curriculum-based progression addresses few-shot data scarcity and promotes robust generalization. Experiments on FewEvent and MAVEN show that Prompt-KD consistently surpasses strong baselines, and ablations confirm the contribution of knowledge distillation, the two-teacher design, the attention mechanism, prompt learning, and curriculum learning, indicating practical potential for scalable, continual event detection systems.

Abstract

Event detection is one of the fundamental tasks in information extraction and knowledge graphs. However, a realistic event detection system often needs to deal with new event classes constantly. These new classes usually have only a few labeled instances, as it is time-consuming and labor-intensive to annotate a large number of unlabeled instances. Therefore, this paper proposes a new task, called class-incremental few-shot event detection. This task faces two problems: forgetting of old knowledge and overfitting on new classes. To solve these problems, this paper further presents a novel method based on knowledge distillation and prompt learning, called Prompt-KD. Specifically, to handle the forgetting of old knowledge, Prompt-KD develops an attention-based multi-teacher knowledge distillation framework, where the ancestor teacher model pre-trained on base classes is reused in all learning sessions, and the father teacher model derives the current student model via adaptation. In addition, to cope with the few-shot learning scenario and alleviate the corresponding overfitting on new classes, Prompt-KD is also equipped with a prompt learning mechanism. Extensive experiments on two benchmark datasets, FewEvent and MAVEN, demonstrate the superior performance of Prompt-KD.
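
The abstract describes the attention-based multi-teacher distillation only at a high level, so the following is a minimal sketch of the general idea rather than the paper's actual formulation. It assumes that all three models emit logits over the same set of previously learned classes, that each teacher's loss is a temperature-scaled KL divergence, and that the attention weights are a softmax over two relevance scores. The function names, the `teacher_scores` input, and the temperature value are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def multi_teacher_kd_loss(student_logits, ancestor_logits, father_logits,
                          teacher_scores, temperature=2.0):
    """Attention-weighted sum of two per-teacher distillation losses.

    teacher_scores holds two unnormalized relevance scores, ordered
    (ancestor, father). How such scores are computed is NOT taken from
    the paper; here they are simply an input (assumption).
    """
    weights = softmax(teacher_scores)  # attention weights, sum to 1
    student = softmax(student_logits, temperature)
    per_teacher = [
        kl_divergence(softmax(ancestor_logits, temperature), student),
        kl_divergence(softmax(father_logits, temperature), student),
    ]
    loss = sum(w * l for w, l in zip(weights, per_teacher))
    return loss, weights


if __name__ == "__main__":
    # Toy logits over three old classes (made-up numbers).
    loss, weights = multi_teacher_kd_loss(
        student_logits=[1.0, 2.0, 3.0],
        ancestor_logits=[1.0, 2.5, 2.5],
        father_logits=[0.5, 2.0, 3.5],
        teacher_scores=[0.2, 0.8],
    )
    print("attention weights:", weights)
    print("distillation loss:", loss)
```

In this sketch a higher score shifts the student toward that teacher's output distribution. A student that already matches both teachers incurs zero loss regardless of the weights.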

Paper Structure

This paper contains 28 sections, 11 equations, 3 figures, and 8 tables.

Figures (3)

  • Figure 1: The diagram of the Prompt-KD method.
  • Figure 2: The F1 score curves of the 5-way 3-shot and the 10-way 3-shot tasks on FewEvent and MAVEN.
  • Figure 3: The heat map of different weights of the teacher models on the 5-way 1-shot tasks on FewEvent (a) and MAVEN (b).