Automated Spinal MRI Labelling from Reports Using a Large Language Model

Robin Y. Park, Rhydian Windsor, Amir Jamaludin, Andrew Zisserman

TL;DR

We propose a general pipeline that uses large language models to automate label extraction from radiology reports. We validate it on spinal MRI reports and show that the extracted labels can be used to train imaging models that classify the identified conditions in the accompanying MR scans.

Abstract

We propose a general pipeline to automate the extraction of labels from radiology reports using large language models, which we validate on spinal MRI reports. The efficacy of our labelling method is measured on five distinct conditions: spinal cancer, stenosis, spondylolisthesis, cauda equina compression and herniation. Using open-source models, our method equals or surpasses GPT-4 on a held-out set of reports. Furthermore, we show that the extracted labels can be used to train imaging models to classify the identified conditions in the accompanying MR scans. All classifiers trained using automated labels achieve comparable performance to models trained using scans manually annotated by clinicians. Code can be found at https://github.com/robinyjpark/AutoLabelClassifier.

Paper Structure

This paper contains 16 sections, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Radiological report labelling pipeline: The prompt step formats the user inputs as shown in Figure 2 to summarise the report based on the target condition. Based on this summary, we extract the binary label using the normalised scores from a chosen set of two unique tokens ("yes" and "no") in the vocabulary.
  • Figure 2: Model prompting strategies: The direct query method (left) asks the model to extract the label based on the report. The summary and query method (right) asks the model to generate a summary focused on the condition, which it uses as additional input to annotate the report. Words in bold indicate user inputs to be modified.
  • Figure 3: MRI classification network: SpineNetV2 is used to detect IVDs. Each IVD is encoded using ResNet18. For stenosis, we use an SVM to get a score per IVD. For cancer and spondylolisthesis, we aggregate IVD encodings and use NSK-SVM to get a score per scan.
  • Figure 4: Real example report from LumbarData with the summary and scores generated for Stenosis using our pipeline. Any dates, names or location information were removed from the report.
  • Figure 5: An example report from CancerData with section headers shown in bold.
  • ...and 3 more figures
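
The label-extraction step described in the Figure 1 caption (normalised scores over the two tokens "yes" and "no") can be illustrated with a minimal sketch. It assumes that "normalised" means a softmax restricted to the two tokens' logits, which the caption does not state explicitly. The function names and the 0.5 threshold are hypothetical and are not taken from the authors' code.

```python
import math


def yes_no_score(logit_yes: float, logit_no: float) -> float:
    """Return the normalised score for "yes" over the token pair.

    Assumption: normalisation is a two-way softmax over the logits
    of the "yes" and "no" tokens (not confirmed by the source).
    """
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logit_yes, logit_no)
    e_yes = math.exp(logit_yes - m)
    e_no = math.exp(logit_no - m)
    return e_yes / (e_yes + e_no)


def binary_label(logit_yes: float, logit_no: float, threshold: float = 0.5) -> bool:
    """Map the normalised "yes" score to a binary label.

    The threshold value is a hypothetical default for illustration.
    """
    return yes_no_score(logit_yes, logit_no) >= threshold


if __name__ == "__main__":
    # Made-up logits for illustration only; not real model output.
    print(yes_no_score(2.0, -1.0))
    print(binary_label(2.0, -1.0))
```

In practice the two logits would be read from the language model's next-token distribution after the query prompt; keeping the continuous score rather than only the thresholded label lets the operating point be tuned per condition.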