Fact-Checking Generative AI: Ontology-Driven Biological Graphs for Disease-Gene Link Verification

Ahmed Abdeen Hamed, Byung Suk Lee, Alessandro Crimi, Magdalena M. Misiak

TL;DR

This paper tackles the challenge of verifying factual content in ChatGPT-generated biomedical text by deploying ontology-driven biological graphs as a fact-checking framework. It builds literature-grounded and ChatGPT-derived disease-gene graphs from large PubMed data and simulated articles, then compares them under a closed-world assumption to quantify edge-level truth for aggregate disease-gene relationships. The main result shows a 70–86% edge-overlap accuracy across 10 graph pairs, illustrating promising reliability of AI-generated biomedical claims when properly grounded in ontologies and literature. The approach offers a pathway for trustworthy deployment of generative AI in biomedicine and outlines future directions such as expanding ontologies, targeting complex diseases, and potential model retraining or guided prompting to enhance factual grounding.

Abstract

Since the launch of various generative AI tools, scientists have been striving to evaluate their capabilities and contents, in the hope of establishing trust in their generative abilities. Regulations and guidelines are emerging to verify generated contents and identify novel uses. We aspire to demonstrate how ChatGPT claims can be checked computationally using the rigor of network models. We aim to fact-check, at the aggregate level, the knowledge embedded in biological graphs constructed from ChatGPT contents. We adopted a biological networks approach that enables the systematic interrogation of ChatGPT's linked entities. We designed an ontology-driven fact-checking algorithm that compares biological graphs constructed from approximately 200,000 PubMed abstracts with counterparts constructed from a dataset generated using the ChatGPT-3.5 Turbo model. In 10 samples of 250 records randomly selected from a ChatGPT dataset of 1,000 "simulated" articles, the fact-checking link accuracy ranged from 70% to 86%. This study demonstrated high accuracy of the aggregate disease-gene relationships found in ChatGPT-generated texts.
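
The edge-level comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes that "link accuracy" means the fraction of ChatGPT-derived disease-gene edges that also appear in the literature graph, and that, under the closed-world assumption, any edge absent from the literature graph counts as unverified. The names `normalize` and `link_accuracy` and the toy edge lists are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Set, Tuple

# An edge is a (disease term, gene symbol) pair. Assumption: both graphs have
# already been reduced to such pairs by ontology-based term matching.
Edge = Tuple[str, str]


def normalize(edges: Iterable[Edge]) -> Set[Edge]:
    """Lower-case disease terms and upper-case gene symbols so that the same
    link written with different casing compares equal."""
    return {(disease.strip().lower(), gene.strip().upper()) for disease, gene in edges}


def link_accuracy(generated: Iterable[Edge], literature: Iterable[Edge]) -> float:
    """Fraction of generated edges that also occur in the literature graph.

    Closed-world assumption: an edge missing from the literature graph is
    treated as not verified rather than as unknown.
    """
    gen = normalize(generated)
    lit = normalize(literature)
    if not gen:
        raise ValueError("generated graph has no edges")
    verified = gen & lit  # edges present in both graphs
    return len(verified) / len(gen)


# Toy data for illustration only; these are not edges from the paper.
literature_edges = [
    ("breast cancer", "BRCA1"),
    ("breast cancer", "BRCA2"),
    ("cystic fibrosis", "CFTR"),
    ("huntington disease", "HTT"),
]
chatgpt_edges = [
    ("Breast Cancer", "brca1"),
    ("Cystic Fibrosis", "CFTR"),
    ("Huntington Disease", "HTT"),
    ("Cystic Fibrosis", "HTT"),  # deliberately unsupported link
]

accuracy = link_accuracy(chatgpt_edges, literature_edges)
print(f"link accuracy: {accuracy:.2f}")  # 3 of 4 edges verified -> 0.75
```

Repeating this calculation for each of the 10 sampled ChatGPT graphs against its literature counterpart would give one accuracy value per graph pair, which is the kind of range the abstract reports.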

Paper Structure

This paper contains 8 sections, 1 figure, 1 table, 1 algorithm.

Figures (1)

  • Figure 1: Two subfigures comparing the 10 ChatGPT graphs against the literature graphs: (a) left, the number of nodes; (b) right, the number of edges.