Generalized knowledge-enhanced framework for biomedical entity and relation extraction

Minh Nguyen, Phuong Le

TL;DR

The paper tackles biomedical entity and relation extraction amid rapidly growing literature by introducing a generalized knowledge-graph framework with two parts: a General-Knowledge (GK) component and a Specific-Knowledge (SK) component, which together enable cross-task transfer. GK draws on external sources (UMLS and Wikidata) and a BioBERT-based masking strategy to derive graph-based representations that are reusable across tasks, while SK adapts to task-specific data and is fused with GK through a fusion graph convolutional mechanism. The approach achieves competitive results on ADE and BioRelEx, with ADE showing clear gains. GK contributes a meaningful, reusable set of nodes, which reduces the need to retrain a knowledge graph for each task. The work demonstrates the potential of a universal knowledge base to improve efficiency and scalability in biomedical NLP and points to extensions into domains beyond biomedicine.

Abstract

In recent years, an increasing number of frameworks have been developed for biomedical entity and relation extraction. This research effort aims to address the accelerating growth in biomedical publications and the intricate nature of biomedical texts, which are written mainly for domain experts. To handle these challenges, we develop a novel framework that utilizes external knowledge to construct a task-independent and reusable background knowledge graph for biomedical entity and relation extraction. The design of our model is inspired by how humans learn domain-specific topics. In particular, humans often first acquire the most basic and common knowledge of a field to build a foundation and then use it as a basis for extending to various specialized topics. Our framework employs such a common-knowledge-sharing mechanism to build a general neural-network knowledge graph that transfers effectively to different domain-specific biomedical texts. Experimental evaluations demonstrate that our model, equipped with this generalized and cross-transferable knowledge base, achieves competitive performance on benchmarks including BioRelEx for binding interaction detection and ADE for Adverse Drug Effect identification.

Paper Structure (15 sections, 2 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Process of extracting relation data using BioBERT: (subject, relation, object) = (flu, has symptoms, fever).
  • Figure 2: Illustration of building a task-specific graph and connecting it with GK.
  • Figure 3: Test results on ADE using our models with Wikidata as the source, over 19 epochs.
  • Figure 4: Example of relation weights when the subject is the disease meningitis.
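
The (subject, relation, object) form named in the Figure 1 caption can be illustrated with a small sketch. It shows one plausible way to turn a subject and relation into a cloze sentence for a masked language model and to assemble a triple from scored candidates. Everything here is an assumption for illustration: the names `Triple`, `cloze_prompt`, and `best_triple` are hypothetical, the prompt template is not taken from the paper, and the candidate scores are made up rather than produced by BioBERT.

```python
from typing import Dict, NamedTuple

MASK = "[MASK]"  # BERT-style mask token


class Triple(NamedTuple):
    """A knowledge-graph edge in (subject, relation, object) form."""
    subject: str
    relation: str
    obj: str


def cloze_prompt(subject: str, relation: str) -> str:
    """Build a cloze sentence whose masked slot stands for the object.

    Hypothetical template; the paper's actual masking strategy may differ.
    """
    return f"{subject} {relation} {MASK}."


def best_triple(subject: str, relation: str,
                scored_candidates: Dict[str, float]) -> Triple:
    """Pick the highest-scoring candidate object and return the triple.

    `scored_candidates` stands in for the scores a masked language model
    would assign to fillers of the masked slot.
    """
    obj = max(scored_candidates, key=scored_candidates.get)
    return Triple(subject, relation, obj)


prompt = cloze_prompt("flu", "has symptoms")

# Made-up scores, for illustration only (not model output).
triple = best_triple("flu", "has symptoms", {"fever": 0.6, "rash": 0.1})
```

In a real pipeline, the made-up score dictionary would be replaced by the predictions of a masked language model over the prompt, and the resulting triples would become edges of the knowledge graph.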