
Transfer Learning for the Prediction of Entity Modifiers in Clinical Text: Application to Opioid Use Disorder Case Detection

Abdullateef I. Almudaifer, Whitney Covington, JaMor Hairston, Zachary Deitch, Ankit Anand, Caleb M. Carroll, Estera Crisan, William Bradford, Lauren Walter, Eaton Ellen, Sue S. Feldman, John D. Osborne

TL;DR

The paper addresses predicting modifiers of clinical entities to preserve accurate semantic interpretation by introducing a multi-task Transformer framework. It leverages a BioBERT backbone with a separate classifier head per modifier and evaluates on the SemEval 2015 ShARe data and a new Opioid Use Disorder (OUD) dataset, using a two-sequence input to emphasize the entity and its context. The model achieves state-of-the-art performance on ShARe (weighted accuracy +1.1%, unweighted accuracy +1.7%, micro-F1 +10%) and demonstrates effective transfer of learned weights to the OUD domain, with domain-adaptive fine-tuning yielding additional gains. The results support the viability of transfer learning for clinical text modifiers, offering improved reliability for downstream information extraction in heterogeneous clinical corpora and potential for real-world deployment in OUD and related domains.

Abstract

Background: The semantics of entities extracted from clinical text can be dramatically altered by modifiers, including entity negation, uncertainty, conditionality, severity, and subject. Existing models for determining modifiers of clinical entities involve regular expressions or feature weights that are trained independently for each modifier. Methods: We develop and evaluate a multi-task transformer architecture in which modifiers are learned and predicted jointly, using the publicly available SemEval 2015 Task 14 corpus and a new Opioid Use Disorder (OUD) data set that contains modifiers shared with SemEval as well as novel modifiers specific to OUD. We evaluate the effectiveness of our multi-task learning approach against previously published systems and assess the feasibility of transfer learning for clinical entity modifiers when only a portion of clinical modifiers are shared. Results: Our approach achieved state-of-the-art results on the ShARe corpus from SemEval 2015 Task 14, showing an increase of 1.1% on weighted accuracy, 1.7% on unweighted accuracy, and 10% on micro F1 scores. Conclusions: We show that learned weights from our shared model can be effectively transferred to a new partially matched data set, validating the use of transfer learning for clinical text modifiers.

Paper Structure (27 sections, 4 equations, 2 figures, 6 tables)


Figures (2)

  • Figure 1: Overview of our modifier prediction model. The multi-task architecture contains a classification head for each distinct modifier type. The single-task architecture has only a single head for the classification of one of the modifiers.
  • Figure 2: Overview of transfer learning. Thin arrows indicate training data flow, color-coded for the data source. Thick arrows indicate fine-tuning operations.
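
The multi-task design described in Figure 1 (one shared representation feeding a separate classification head per modifier) can be illustrated with a small sketch. This is not the authors' implementation: the encoder is replaced by a toy fixed-size vector standing in for the BioBERT output, the label values and hidden size are hypothetical, and summing the per-head cross-entropy losses is an assumed way of training the heads jointly.

```python
import math
import random

# Modifier names come from the abstract; the label values are
# illustrative assumptions, not the corpus annotation scheme.
MODIFIER_LABELS = {
    "negation": ["no", "yes"],
    "uncertainty": ["no", "yes"],
    "severity": ["unmarked", "slight", "moderate", "severe"],
}
HIDDEN = 8  # toy stand-in for the encoder's hidden size


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LinearHead:
    """One classification head: logits = W h + b."""

    def __init__(self, n_labels, rng):
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)]
                  for _ in range(n_labels)]
        self.b = [0.0] * n_labels

    def __call__(self, h):
        return [sum(wi * hi for wi, hi in zip(row, h)) + b
                for row, b in zip(self.w, self.b)]


class MultiTaskModifierModel:
    """Shared representation h, one head per modifier type."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.heads = {name: LinearHead(len(labels), rng)
                      for name, labels in MODIFIER_LABELS.items()}

    def predict(self, h):
        """Return the most probable label for every modifier."""
        out = {}
        for name, head in self.heads.items():
            probs = softmax(head(h))
            out[name] = MODIFIER_LABELS[name][probs.index(max(probs))]
        return out

    def joint_loss(self, h, gold):
        """Sum of per-head cross-entropy losses (an assumed joint objective)."""
        loss = 0.0
        for name, head in self.heads.items():
            probs = softmax(head(h))
            gold_index = MODIFIER_LABELS[name].index(gold[name])
            loss += -math.log(probs[gold_index])
        return loss


# Toy usage: a random vector plays the role of the encoded entity mention.
rng = random.Random(1)
h = [rng.uniform(-1, 1) for _ in range(HIDDEN)]
model = MultiTaskModifierModel()
prediction = model.predict(h)
loss = model.joint_loss(
    h, {"negation": "yes", "uncertainty": "no", "severity": "unmarked"})
```

In this sketch, the single-task variant from Figure 1 would correspond to keeping only one entry in `MODIFIER_LABELS`. Reusing the shared part of a model while adding heads for new modifiers is the general idea behind the transfer setting in Figure 2, although the toy model here has no trainable encoder to transfer.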