Clinical Trials Ontology Engineering with Large Language Models

Berkan Çakır

TL;DR

The paper tackles the challenge of converting rapidly growing clinical-trial results into usable knowledge by integrating LLMs with a dedicated ontology-merging pipeline. It presents a per-trial ontology extraction workflow using GPT3.5, GPT4, and Llama3 (8b/70b), followed by a novel merging strategy that achieves $O(n)$ merge time and $O(\log n)$ lookup via a sorted synonym list, enabling scalable, real-time data integration. Through practical evaluation and OQuaRE-based quality assessment, the study finds that chained prompting generally improves information extraction, with GPT4 approaching human-level performance in some settings, though issues like missing prefixes can reduce validity. The work demonstrates meaningful cost and time savings over manual curation and discusses extrapolated large-scale implications, arguing for LLM-assisted clinical-trial ontology engineering as a practical path toward real-time medical knowledge integration.
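The summary does not spell out the merging strategy, so the following is only a minimal sketch of how a sorted synonym list can support linear-time merging and fast lookup. The data layout (a list of `(synonym, concept)` pairs sorted by synonym), the function names `merge_sorted` and `lookup`, the use of binary search, and the example drug terms are all assumptions made for illustration, not the paper's implementation.

```python
from bisect import bisect_left

# Hypothetical layout: each entry is (synonym, canonical_concept),
# and every list is kept sorted by synonym.

def merge_sorted(a, b):
    """Merge two sorted synonym lists in a single linear pass.

    When both lists contain the same synonym, the mapping from `a` wins.
    """
    merged = []
    i = j = 0
    while i < len(a) or j < len(b):
        # Take from `a` when `b` is exhausted or `a` has the smaller (or equal) key.
        if j >= len(b) or (i < len(a) and a[i][0] <= b[j][0]):
            entry = a[i]
            i += 1
        else:
            entry = b[j]
            j += 1
        # Skip a synonym that was already emitted.
        if not merged or merged[-1][0] != entry[0]:
            merged.append(entry)
    return merged

def lookup(synonyms, term):
    """Binary-search the sorted list; return the concept or None."""
    key = term.lower()
    # (key,) sorts before any (key, concept) pair, so this finds the first match.
    idx = bisect_left(synonyms, (key,))
    if idx < len(synonyms) and synonyms[idx][0] == key:
        return synonyms[idx][1]
    return None

trial_a = [("asa", "Aspirin"), ("aspirin", "Aspirin")]
trial_b = [("acetylsalicylic acid", "Aspirin"), ("ibuprofen", "Ibuprofen")]
combined = merge_sorted(trial_a, trial_b)
print(lookup(combined, "ASA"))          # found via binary search
print(lookup(combined, "paracetamol"))  # unknown term
```

Under these assumptions, merging two lists touches each entry once, and a lookup needs only a logarithmic number of comparisons because the list stays sorted after every merge.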

Abstract

Managing clinical trial information is currently a significant challenge for the medical industry, as traditional methods are both time-consuming and costly. This paper proposes a simple yet effective methodology to extract and integrate clinical trial data in a cost-effective and time-efficient manner, allowing the medical industry to stay up to date with medical developments. It compares the time, cost, and quality of ontologies created by humans, GPT3.5, GPT4, and Llama3 (8b & 70b). The findings suggest that large language models (LLMs) are a viable option for automating this process from both a cost and a time perspective. This study underscores significant implications for medical research, where real-time data integration from clinical trials could become the norm.

Paper Structure

This paper contains 18 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: High-level overview of the proposed methodology.
  • Figure 2: NOCOnto metric across different LLMs and/or techniques used. NOCOnto (normalised) takes into account differences in the number of ontologies excluded due to syntax errors (see Table 1, column "included ontologies"). Note that GPT4v4 results are unknown and thus not included.
  • Figure 3: All OQuaRE metrics across different LLMs and/or techniques used. Note that GPT4v4 results are unknown and thus not included. Also note that the NOCOnto OQuaRE metric is shown separately for formatting reasons.