Few-Shot Domain Adaptation for Named-Entity Recognition via Joint Constrained k-Means and Subspace Selection

Ayoub Hammal, Benno Uthayasooriyar, Caio Corro

TL;DR

This paper tackles few-shot named-entity recognition by formulating domain adaptation as a weakly supervised, constrained clustering problem. It extends $k$-means with label supervision, cluster-size (ratio) constraints on the O tag, and joint discriminative subspace selection to learn well-separated prototypes for entity types, using both a small labeled support set and a large pool of unlabeled target-domain data. The authors introduce a deterministic, parameter-free training pipeline with a robust initialization and an efficient E-step, including a generalized eigenvalue solution for subspace learning. Empirically, the method achieves state-of-the-art results on several English NER benchmarks in both tag-set extension and domain-transfer scenarios, highlighting the value of unlabeled data and discriminative projections for few-shot adaptation.

Abstract

Named-entity recognition (NER) is a task that typically requires large annotated datasets, which limits its applicability across domains with varying entity definitions. This paper addresses few-shot NER, aiming to transfer knowledge to new domains with minimal supervision. Unlike previous approaches that rely solely on limited annotated data, we propose a weakly supervised algorithm that combines small labeled datasets with large amounts of unlabeled data. Our method extends the k-means algorithm with label supervision, cluster size constraints and domain-specific discriminative subspace selection. This unified framework achieves state-of-the-art results in few-shot NER on several English datasets.
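
To make the cluster-size constraint concrete, here is a minimal sketch of an assignment (E) step in which a fixed fraction of the points must go to the O cluster. It is an illustration under stated assumptions, not the authors' algorithm: the function name `ratio_constrained_e_step`, the `o_ratio` and `o_index` parameters, the exact-count reading of the constraint, and the toy distance matrix are all hypothetical, and label supervision is left out.

```python
import numpy as np


def ratio_constrained_e_step(dist, o_ratio, o_index=0):
    """Assign each point to a cluster, forcing a fixed share into the O cluster.

    dist    : (n, k) array of point-to-centroid distances (hypothetical input).
    o_ratio : fraction of points that must be assigned to the O cluster
              (assumed here to be an exact-count constraint).
    o_index : column of `dist` that corresponds to the O cluster.
    """
    n, k = dist.shape
    n_o = int(round(o_ratio * n))  # number of points that must land in O

    # Best non-O (entity) cluster for every point.
    entity_ids = np.delete(np.arange(k), o_index)
    best_entity = entity_ids[np.argmin(dist[:, entity_ids], axis=1)]
    d_entity = dist[np.arange(n), best_entity]

    # Extra cost paid when a point is sent to O instead of its best entity cluster.
    extra_cost = dist[:, o_index] - d_entity

    # Send the n_o points with the smallest extra cost to O, the rest to entities.
    assign = best_entity.copy()
    assign[np.argsort(extra_cost, kind="stable")[:n_o]] = o_index
    return assign


# Toy example: 3 points, clusters [O, entity A, entity B], O ratio of 1/3.
dist = np.array([
    [0.1, 2.0, 3.0],
    [0.5, 0.9, 2.0],   # nearest centroid is O, but the ratio leaves no room
    [3.0, 2.5, 0.2],
])
unconstrained = np.argmin(dist, axis=1)
constrained = ratio_constrained_e_step(dist, o_ratio=1 / 3)
print(unconstrained.tolist())  # [0, 0, 2]
print(constrained.tolist())    # [0, 1, 2]
```

Sorting by the extra cost of choosing O gives the optimal assignment when only one cluster has a fixed size. It is used here for simplicity and may differ from the E-step described in the paper.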

Paper Structure

This paper contains 23 sections, 47 equations, 2 figures, 4 tables.

Figures (2)

  • Figure 1: Illustration of the E step with ratio constraints. In the two bipartite graphs, left (resp. right) nodes represent datapoints (resp. clusters). The left graph is the full graph, where edge weights indicate distances between nodes and clusters. By contracting the two sets of clusters, we obtain a new graph, on which we can run the E step with a ratio constraint for the contracted O cluster (ratio is set to $1/3$ in the example). Thick red edges indicate the optimal solution. Note that, without the ratio constraint, ${\bm{x}}^{(2)}$ would be assigned to the O cluster.
  • Figure 2: Illustration of the benefit of subspace selection. (top) Data in its original 2D space. We assume the constrained clustering results in two clusters: one containing the two black crosses and the other containing the two red circles. Let the green star be a test point. Intuitively, it should be classified in the black crosses cluster; however, it is closer to the centroid of the other cluster. (bottom) Data after projection into a 1D space. The test point is now correctly classified.
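
The effect described in the Figure 2 caption can be reproduced numerically. The sketch below is a hedged illustration only: the coordinates are made up, and the 1D direction is a two-class Fisher discriminant, used as a stand-in because the paper's own subspace objective is not given in this summary. The names `crosses`, `circles`, `star`, `scatter` and `nearest_centroid` are hypothetical.

```python
import numpy as np

# Hypothetical 2D toy data (not taken from the paper).
crosses = np.array([[-0.5, -10.0], [0.5, 10.0]])  # cluster 0, spread along y
circles = np.array([[2.5, 5.0], [3.5, 7.0]])      # cluster 1, compact
star = np.array([0.0, 8.0])                       # test point


def nearest_centroid(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(centroids - point, axis=-1)))


def scatter(points):
    """Within-cluster scatter matrix: sum of outer products of centered points."""
    centered = points - points.mean(axis=0)
    return centered.T @ centered


centroids = np.stack([crosses.mean(axis=0), circles.mean(axis=0)])

# Stand-in for subspace selection: Fisher direction w = S_w^{-1} (mu_1 - mu_0),
# which favors directions with small within-cluster spread.
s_w = scatter(crosses) + scatter(circles)
w = np.linalg.solve(s_w, centroids[1] - centroids[0])
w /= np.linalg.norm(w)

label_2d = nearest_centroid(star, centroids)
label_1d = nearest_centroid(np.array([star @ w]), (centroids @ w)[:, None])

print(label_2d)  # 1: in 2D the star is closer to the circles' centroid
print(label_1d)  # 0: after projection it is closer to the crosses' centroid
```

In the original space the large spread of the crosses along one axis dominates the distance. Projecting onto a direction with small within-cluster spread removes that axis, and the test point falls back into the crosses cluster.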

Theorems & Definitions (3)

  • Definition 1
  • Definition 2
  • Definition 3