
Knowledge Graph Extraction from Biomedical Literature for Alkaptonuria Rare Disease

Giang Pham, Rebecca Finetti, Caterina Graziani, Bianca Roncaglia, Asma Bendjeddou, Linda Brodo, Sara Brunetti, Moreno Falaschi, Stefano Forti, Silvia Giulia Galfré, Paolo Milazzo, Corrado Priami, Annalisa Santucci, Ottavia Spiga, Alina Sîrbu

Abstract

Alkaptonuria (AKU) is an ultra-rare autosomal recessive metabolic disorder caused by mutations in the HGD (homogentisate 1,2-dioxygenase) gene, which lead to a pathological accumulation of homogentisic acid (HGA) in body fluids and tissues. This accumulation causes systemic manifestations, including premature spondyloarthropathy, renal and prostatic stones, and cardiovascular complications. Because the disease is ultra-rare, both clinical data and literature on it are limited. Knowledge graphs (KGs) can help connect the limited knowledge about the disease (basic mechanisms, manifestations and existing therapies) with other biomedical knowledge; however, AKU is frequently underrepresented or entirely absent in existing biomedical KGs. In this work, we apply a text-mining methodology based on PubTator3 for large-scale extraction of biomedical relations. We construct two KGs of different sizes, validate them against existing biochemical knowledge, and use them to extract genes, diseases and therapies possibly related to AKU. This computational framework reveals the systemic interactions of the disease, its comorbidities, and potential therapeutic targets, demonstrating the efficacy of our approach in analyzing rare metabolic disorders.

Paper Structure (5 sections, 10 figures)

This paper contains 5 sections, 10 figures.

Figures (10)

  • Figure 1: The high-confidence network: KG containing only relations that appear in at least two publications. An HTML view of the network is available at https://giangpth.github.io/Alkaptonuria/visualizations/highconfidence.html.
  • Figure 2: Comparison between the two KGs and the gene–gene connections from STRING.
  • Figure 3: Comparison between our KGs and gene–drug connections from DGIdb.
  • Figure 4: Comparison between the KGs and the network derived from the tyrosine metabolism pathway in KEGG.
  • Figure 5: Distribution of node degrees and clustering coefficients.
  • ...and 5 more figures