Leveraging Prompt-Learning for Structured Information Extraction from Crohn's Disease Radiology Reports in a Low-Resource Language

Liam Hazan; Gili Focht; Naama Gavrielov; Roi Reichart; Talar Hagopian; Mary-Louise C. Greer; Ruth Cytter Kuint; Dan Turner; Moti Freiman

TL;DR

The paper tackles the extraction of structured information from Hebrew Crohn's disease radiology reports, where data imbalance and limited language resources hinder traditional NLP. It introduces SMP-BERT, a prompt-learning model that is pre-trained on a Section Matching Prediction task linking the Findings and Impression sections, which enables zero-shot inference and data-efficient fine-tuning. In experiments with ~9.7k Hebrew reports, SMP-BERT + tuning achieves a median AUC of 0.99 and a median F1 of 0.84, substantially outperforming standard fine-tuning and zero-shot variants, especially for rare phenotypes. The work demonstrates that prompt-learning approaches can deliver high-accuracy information extraction in low-resource languages, with implications for scalable AI-assisted Crohn's disease diagnostics and broader healthcare NLP.

Abstract

Automatic conversion of free-text radiology reports into structured data using Natural Language Processing (NLP) techniques is crucial for analyzing diseases on a large scale. While effective for tasks in widely spoken languages like English, generative large language models (LLMs) typically underperform with less common languages and can pose risks to patient privacy. Fine-tuning local NLP models is hindered by the skewed nature of real-world medical datasets, in which rare findings create a significant data imbalance. We introduce SMP-BERT, a novel prompt-learning method that leverages the structured nature of reports to overcome these challenges. In our studies involving a substantial collection of Crohn's disease radiology reports in Hebrew (over 8,000 patients and 10,000 reports), SMP-BERT greatly surpassed traditional fine-tuning methods in performance, notably in detecting infrequent conditions (AUC: 0.99 vs 0.94, F1: 0.84 vs 0.34). SMP-BERT makes more accurate AI diagnostics available for low-resource languages.
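
As a rough illustration of the Section Matching Prediction idea described in the TL;DR, the Python sketch below builds matched and mismatched Findings/Impression pairs that a sentence-pair classifier could be pre-trained on. The `Report` class, the `build_smp_pairs` function, and in particular the negative-sampling strategy (borrowing the Impression of a different report, by analogy with next-sentence prediction) are assumptions made for illustration; this summary does not state how the paper constructs non-matching pairs.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """Hypothetical container for the two report sections used by SMP."""
    findings: str
    impression: str


def build_smp_pairs(reports, seed=0):
    """Return (findings, impression, label) triples; label 1 means the sections match."""
    rng = random.Random(seed)
    pairs = []
    for i, report in enumerate(reports):
        # Positive instance: Findings paired with its own Impression.
        pairs.append((report.findings, report.impression, 1))
        # Assumption: a negative instance pairs the same Findings with the
        # Impression of a randomly chosen *different* report.
        others = [j for j in range(len(reports)) if j != i]
        if others:
            j = rng.choice(others)
            pairs.append((report.findings, reports[j].impression, 0))
    return pairs


reports = [
    Report("findings A", "impression A"),
    Report("findings B", "impression B"),
    Report("findings C", "impression C"),
]
pairs = build_smp_pairs(reports)
```

Each triple could then be fed to a BERT-style model as a two-segment input with a binary matching objective; that training loop is omitted here.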

Paper Structure (13 sections, 1 equation, 7 figures, 2 tables)

Introduction
Related Work
Radiology Reports Information Extraction
Prompt Learning
SMP-BERT Framework
Section Matching Prediction
Inference with SMP-BERT
SMP-tuning
Experiments
Data
Experimental Setup
Results
Discussion

Figures (7)

Figure 1: Comparison of the median AUC and F1-score of three models (Standard Fine-tuning, SMP-BERT Zero-Shot, and SMP-BERT + tuning) over all phenotypes with 10+ positives. Error bars represent the Interquartile Range (IQR).
Figure 2: Example of SMP-BERT Input and Output. An excerpt from a radiology report relevant to a patient's Crohn's disease (CD) diagnosis. The "Findings" section serves as the input to the SMP-BERT model, as in its pre-training phase.
Figure 3: SMP-BERT Methodology - This figure illustrates three pre-training tasks and how they can be used for text classification through prompt learning. Using MLM (token-level) for inference requires "cloze question" prompts and a verbalizer function to convert labels into single-token answers (e.g., "positive"/"negative"). Using NSP (sentence-level) is simpler: it allows prompts of varying lengths, but it is still limited to single-sentence classification. Our novel SMP removes this limitation by pre-training on matching whole sections (multiple-sentence level). At inference, the "Impression" section is replaced with a prompt about the presence/absence of a finding.
Figure 4: SMP-tuning - SMP-BERT is fine-tuned by generating one positive and one negative instance for every annotated sample and every label. Here the true label is "There is finding ...", so the negative instance is paired with "There is not finding ...".
Figure 5: Flowchart of study design - The flowchart outlines the sequence of processing steps from data acquisition to model evaluation. It visualizes the progression from the initial collection of MRI and CT Hebrew radiology reports, through the stages of manual annotation and multi-label stratification, culminating in the pre-training/training of the different models.
...and 2 more figures
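
The captions of Figures 3 and 4 describe replacing the "Impression" section with a presence/absence prompt and generating one positive and one negative instance per annotated sample and label. A minimal Python sketch of that instance generation follows. The function names, the `{name}` placeholder, and the English prompt wording (adapted from the truncated examples in the Figure 4 caption) are hypothetical; the actual prompts in the paper may differ.

```python
# Hypothetical prompt templates, modeled on the Figure 4 caption.
POSITIVE_TEMPLATE = "There is finding {name}"
NEGATIVE_TEMPLATE = "There is not finding {name}"


def inference_input(findings, name):
    """Pair the Findings text with a presence prompt in place of the Impression."""
    return (findings, POSITIVE_TEMPLATE.format(name=name))


def smp_tuning_instances(findings, labels):
    """Build (findings, prompt, match) triples for SMP-tuning.

    `labels` maps a finding name to True (present) or False (absent).
    For each label, the prompt that agrees with the annotation gets
    match=1 and the opposite prompt gets match=0.
    """
    instances = []
    for name, present in labels.items():
        pos = POSITIVE_TEMPLATE.format(name=name)
        neg = NEGATIVE_TEMPLATE.format(name=name)
        true_prompt, false_prompt = (pos, neg) if present else (neg, pos)
        instances.append((findings, true_prompt, 1))
        instances.append((findings, false_prompt, 0))
    return instances


# Illustrative, made-up annotation for one report.
instances = smp_tuning_instances(
    "report findings text", {"stenosis": True, "fistula": False}
)
```

Under this scheme every annotated label contributes both a matching and a non-matching example, so the matching objective stays balanced even when a finding itself is rare.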
