Demo: Interactive Visualization of Semantic Relationships in a Biomedical Project's Talent Knowledge Graph

Jiawei Xu, Zhandos Sembay, Swathi Thaker, Pamela Payne-Foster, Jake Yue Chen, Ying Ding

TL;DR

This work presents an interactive visualization of the CM4AI Talent Knowledge Graph (TKG), a semantic space encompassing roughly 28,000 researchers and 1,179 biomedical datasets, organized in a 2D layout derived from transformer-based embeddings. The pipeline combines data from the PubMed Knowledge Graph and Semantic Scholar, uses Specter2 to generate 768-dimensional embeddings, and identifies potential collaborators and dataset users via cosine similarity, with 30 top collaborators per author and 150 top users per dataset; GPT-4o provides justification for these recommendations. The visualization is implemented with PixiJS (WebGL), TypeScript, and Svelte, employing t-SNE and UMAP for dimensionality reduction to produce the 2D embedding, and integrates detailed profiles via Oracle APEX. This framework enables GenAI-driven explanations and interactive exploration, and is adaptable to other biomedical knowledge graphs, addressing scalability limitations of traditional tools and supporting more informed collaboration and data usage in medical AI research.
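
The cosine-similarity ranking mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cosine_similarity`, `top_k_similar`), the toy 3-dimensional vectors, and the choice to exclude only the query node itself are assumptions made for illustration. In the described pipeline the vectors would be the 768-dimensional Specter2 embeddings, with k = 30 for collaborators and k = 150 for dataset users.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 when either vector is all zeros (a convention chosen here).
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query_id: str,
    embeddings: Dict[str, Sequence[float]],
    k: int = 30,
) -> List[Tuple[str, float]]:
    """Rank every other node by cosine similarity to the query node.

    Hypothetical helper: brute-force scan over all nodes, keeping the k best.
    """
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != query_id  # assumption: only the query itself is excluded
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data standing in for real embeddings (assumed values, not from the paper).
toy_embeddings = {
    "author_a": [1.0, 0.0, 0.0],
    "author_b": [0.9, 0.1, 0.0],
    "author_c": [0.0, 1.0, 0.0],
    "author_d": [-1.0, 0.0, 0.0],
}

ranked = top_k_similar("author_a", toy_embeddings, k=2)
for node_id, score in ranked:
    print(f"{node_id}: {score:.3f}")
```

A brute-force scan like this is quadratic when run for every node; at roughly 29,000 nodes that is still feasible offline, though a vectorized or approximate nearest-neighbor approach would be a natural alternative.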

Abstract

We present an interactive visualization of the Cell Map for AI Talent Knowledge Graph (CM4AI TKG), a detailed semantic space comprising approximately 28,000 experts and 1,000 datasets focused on the biomedical field. Our tool leverages transformer-based embeddings, WebGL visualization techniques, and generative AI, specifically Large Language Models (LLMs), to provide a responsive and user-friendly interface. This visualization supports the exploration of around 29,000 nodes, assisting users in identifying potential collaborators and dataset users within the health and biomedical research fields. Our solution transcends the limitations of conventional graph visualization tools like Gephi, particularly in handling large-scale interactive graphs. We utilize GPT-4o to furnish detailed justifications for recommended collaborators and dataset users, promoting informed decision-making. Key functionalities include responsive search and exploration, as well as GenAI-driven recommendations, all contributing to a nuanced representation of the convergence between biomedical and AI research landscapes. In addition to benefiting the Bridge2AI and CM4AI communities, this adaptable visualization framework can be extended to other biomedical knowledge graphs, fostering advancements in medical AI and healthcare innovation through improved user interaction and data exploration. The demonstration is available at: https://jiawei-alpha.vercel.app/.

Paper Structure (7 sections, 2 figures)

Figures (2)

  • Figure 1: Information windows for different items: (a) a talent, (b) a dataset.
  • Figure 2: LLM justifications for recommendations: (a) for Trey Ideker, (b) for CRISPR screening data.