Lessons from Natural Language Inference in the Clinical Domain

Alexey Romanov, Chaitanya Shivade

TL;DR

The paper tackles the limited generalization of NLP models in the data-scarce clinical domain by introducing MedNLI, a clinician-annotated natural language inference dataset grounded in patient histories. It systematically evaluates open-domain transfer learning and domain-knowledge integration via medical ontologies, demonstrating that domain-specific embeddings and transfer learning yield consistent gains, while retrofit-based approaches can hurt performance. The work provides thorough baselines (feature-based, BOW, InferSent, ESIM), analyzes transfer settings and embeddings, and introduces knowledge-directed attention as a viable integration technique, complemented by a frank error analysis and discussion of limitations. By releasing MedNLI and accompanying code, the authors aim to accelerate clinical NLP and NLI research and its practical applications, such as trial eligibility and guideline compliance monitoring.

Abstract

State-of-the-art models using deep neural networks have become very good at learning an accurate mapping from inputs to outputs. However, they still lack generalization capabilities in conditions that differ from the ones encountered during training. This is even more challenging in specialized, knowledge-intensive domains, where training data is limited. To address this gap, we introduce MedNLI, a dataset annotated by doctors for a natural language inference (NLI) task and grounded in the medical history of patients. We present strategies to: 1) leverage transfer learning using datasets from the open domain (e.g., SNLI) and 2) incorporate domain knowledge from external data and lexical sources (e.g., medical terminologies). Our results demonstrate performance gains using both strategies.

Paper Structure

This paper contains 28 sections, 1 equation, 5 figures, 9 tables.

Figures (5)

  • Figure 1: Prompt shown to clinicians for annotations
  • Figure 2: Box plot of the distribution of sentence length in tokens in SNLI and MedNLI
  • Figure 3: ESIM model. Dashed blocks illustrate the knowledge-directed attention matrix and the corresponding vectors (see the section on knowledge-directed attention for details).
  • Figure 4: Schematic depiction of the model for multi-target transfer learning
  • Figure 5: Lengths of the shortest paths between concepts in the premise and the hypothesis. $0$ indicates that they contain the same concept.
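
The quantity plotted in Figure 5 can be illustrated with a minimal sketch, assuming the ontology is treated as an undirected graph of concepts. The toy concepts and edges below are hypothetical, chosen only for illustration, and are not taken from the paper or from any real medical terminology. The helper names (`build_graph`, `shortest_path_length`, `pairwise_path_lengths`) are likewise assumptions, not the authors' code.

```python
from collections import deque

# Hypothetical toy ontology: undirected links between concept names.
# Illustrative only; not drawn from the paper or a real terminology.
TOY_EDGES = [
    ("disease", "cardiovascular disease"),
    ("cardiovascular disease", "hypertension"),
    ("cardiovascular disease", "myocardial infarction"),
    ("disease", "diabetes mellitus"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of (a, b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_length(graph, source, target):
    """Return the number of edges on a shortest path, or None if unreachable.

    A result of 0 means source and target are the same concept.
    """
    if source == target:
        return 0
    if source not in graph or target not in graph:
        return None
    seen = {source}
    queue = deque([(source, 0)])
    # Breadth-first search visits nodes in order of increasing distance.
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def pairwise_path_lengths(graph, premise_concepts, hypothesis_concepts):
    """Map each (premise concept, hypothesis concept) pair to its path length."""
    return {
        (p, h): shortest_path_length(graph, p, h)
        for p in premise_concepts
        for h in hypothesis_concepts
    }


GRAPH = build_graph(TOY_EDGES)
LENGTHS = pairwise_path_lengths(
    GRAPH,
    ["hypertension"],
    ["hypertension", "myocardial infarction", "diabetes mellitus"],
)
for pair, length in LENGTHS.items():
    print(pair, length)
```

How the concepts are extracted from each sentence, and whether the paper aggregates the pairwise lengths in any particular way, is not specified in this section, so the sketch simply reports every pair.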