EnriCo: Enriched Representation and Globally Constrained Inference for Entity and Relation Extraction

Urchade Zaratiana, Nadi Tomeh, Yann Dauxais, Pierre Holat, Thierry Charnois

TL;DR

EnriCo addresses limitations in joint entity and relation extraction by coupling attention-driven, enriched span and relation representations with constraint-aware decoding. It introduces a Filter and Refine mechanism that prunes candidates and updates their representations, and employs an Entity-Relation bias scheme together with ASP-based decoding to enforce dataset-specific constraints. Experimental results on ACE 05, CoNLL04, and SciERC show competitive entity and relation F1 scores, with constrained decoding offering gains and the fast Entity-First variant delivering substantial speedups. This approach supports knowledge graph construction by producing coherent, well-structured extractions aligned with domain rules.

Abstract

Joint entity and relation extraction plays a pivotal role in various applications, notably in the construction of knowledge graphs. Despite recent progress, existing approaches often fall short in two key aspects: richness of representation and coherence in output structure. These models often rely on handcrafted heuristics for computing entity and relation representations, potentially leading to loss of crucial information. Furthermore, they disregard task and/or dataset-specific constraints, resulting in output structures that lack coherence. In our work, we introduce EnriCo, which mitigates these shortcomings. Firstly, to foster rich and expressive representation, our model leverages attention mechanisms that allow both entities and relations to dynamically determine the pertinent information required for accurate extraction. Secondly, we introduce a series of decoding algorithms designed to infer the highest scoring solutions while adhering to task and dataset-specific constraints, thus promoting structured and coherent outputs. Our model demonstrates competitive performance compared to baselines when evaluated on Joint IE datasets.

Paper Structure

This paper contains 38 sections, 16 equations, 6 figures, 9 tables.

Figures (6)

  • Figure 1: The model consists of three key components: (1) Word Representation, which computes an embedding for each word in the input sentence. (2) Entity Classification Module, which computes, prunes, and enriches span representations, then classifies them. (3) Relation Classification Module, which similarly computes, prunes, and enriches relation representations, then classifies them. The pruning and enrichment of entity and relation representations are performed by a "Filter and Refine" layer, as described in Section \ref{sec:refine} and illustrated in Figure 2.
  • Figure 2: Filter and Refine. This layer processes either span or relation representations. It first computes a ranking score for each span or relation and keeps the top-k highest-scoring candidates. The selected spans or relations are then passed through a "Read & Process" layer.
  • Figure 3: Learned bias values (Sec. \ref{sec:learned_bias}) for the ACE 05 dataset. This figure shows the learned biases for different pairings of entity and relation types. (left) $\bm{\phi}(h, r)$, bias scores between head entity type and relation type. (middle) $\bm{\phi}(t, r)$, bias scores between tail entity type and relation type. (right) $\bm{\phi}(h, t)$, bias scores between head entity type and tail entity type.
  • Figure 4: Attention visualization. This illustrates the attention scores of candidate entities and candidate relations within the input sequence, averaged across attention heads.
  • Figure 5: This figure illustrates the highest-scoring relation type for each pair of entity types.
  • ...and 1 more figure
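
The Filter and Refine step described in the Figure 2 caption (score every candidate, keep the top-k, pass the survivors to a further layer) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `filter_top_k` and `filter_and_refine` are hypothetical, the ranking scores are assumed to be given, and the `refine` callable is only a stand-in for the "Read & Process" layer, which is not specified here.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def filter_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring candidates, in original order."""
    k = max(0, min(k, len(scores)))
    # Rank indices by descending score; sorted() is stable, so ties keep
    # the candidate that appears first.
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    return sorted(ranked[:k])


def filter_and_refine(
    reps: Sequence[T],
    scores: Sequence[float],
    k: int,
    refine: Callable[[T], T],
) -> Tuple[List[int], List[T]]:
    """Keep the top-k candidates, then apply `refine` to each survivor.

    `refine` is a hypothetical placeholder for the "Read & Process" layer.
    """
    kept = filter_top_k(scores, k)
    return kept, [refine(reps[i]) for i in kept]


if __name__ == "__main__":
    # Toy candidates: strings stand in for span or relation representations.
    candidates = ["a", "b", "c", "d"]
    ranking_scores = [0.1, 0.9, 0.4, 0.7]
    kept, refined = filter_and_refine(candidates, ranking_scores, 2, str.upper)
    print(kept, refined)
```

Pruning before refinement means the more expensive second step only runs on k candidates rather than on every span or relation, which is presumably the motivation for ordering the two steps this way.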