Integrating knowledge bases to improve coreference and bridging resolution for the chemical domain

Pengcheng Lu, Massimo Poesio

TL;DR

The paper addresses coreference and bridging resolution in chemical patents by proposing a multi-task end-to-end model that integrates an external knowledge base via a knowledge fusion layer to produce knowledge-enriched span representations. Using a SpanBERT-based encoder, entity linking to UMLS, and a fusion mechanism where the knowledge-informed span embedding is $g_i = [s_i, \mathrm{FFNN}_e(e_i)]$, the approach enhances both coreference and bridging tasks. Experimental results on ChEMU-REF 2021 show that SpanBERT with UMLS knowledge yields the best performance, and multi-task training plus chemical tokenization further boost efficiency and accuracy. This work demonstrates the practical value of external chemical knowledge in improving information extraction from chemical patents and related literature.
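The fusion step $g_i = [s_i, FFNN_e(e_i)]$ can be illustrated with a minimal sketch: the entity embedding $e_i$ of the linked knowledge-base entry is passed through a feed-forward projection and concatenated with the span embedding $s_i$. The summary does not specify the architecture of $FFNN_e$, so the single ReLU layer, the toy dimensions, and the function names `ffnn` and `fuse` below are assumptions made purely for illustration, not the authors' implementation.

```python
import random

def ffnn(x, weights, bias):
    """One-layer feed-forward projection with ReLU: relu(W x + b).
    (Assumed form; the actual FFNN_e may be deeper.)"""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def fuse(span_emb, entity_emb, weights, bias):
    """Knowledge-informed span embedding g_i = [s_i, FFNN_e(e_i)].
    The brackets denote concatenation, so s_i is kept unchanged."""
    return span_emb + ffnn(entity_emb, weights, bias)

# Toy sizes, hypothetical: real span and entity embeddings are much larger.
span_dim, entity_dim, proj_dim = 4, 3, 2
rng = random.Random(0)
s_i = [rng.uniform(-1, 1) for _ in range(span_dim)]      # span embedding
e_i = [rng.uniform(-1, 1) for _ in range(entity_dim)]    # linked-entity embedding
W = [[rng.uniform(-1, 1) for _ in range(entity_dim)] for _ in range(proj_dim)]
b = [0.0] * proj_dim

g_i = fuse(s_i, e_i, W, b)
print(len(g_i))  # span_dim + proj_dim
```

Because the fusion is a concatenation, the original span representation passes through untouched, and the downstream coreference and bridging scorers simply see a longer vector whose extra dimensions carry the projected knowledge-base signal.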

Abstract

Resolving coreference and bridging relations in chemical patents is important for understanding the precise chemical process, a task for which chemical domain knowledge is critical. We propose an approach that incorporates external knowledge into a multi-task learning model for both coreference and bridging resolution in the chemical domain. The results show that integrating external knowledge benefits both chemical coreference and bridging resolution.

Paper Structure (18 sections, 11 equations, 1 figure, 7 tables)

This paper contains 18 sections, 11 equations, 1 figure, 7 tables.

Figures (1)

  • Figure 1: The framework of our proposed model for chemical coreference and bridging resolution