BioBridge: Bridging Biomedical Foundation Models via Knowledge Graphs

Zifeng Wang, Zichen Wang, Balasubramaniam Srinivasan, Vassilis N. Ioannidis, Huzefa Rangwala, Rishita Anubhai

TL;DR

BioBridge addresses the challenge of unlocking multimodal reasoning for biomedical foundation models by introducing a KG-guided bridge that connects frozen unimodal embeddings. It learns a relation-aware additive transformation via a bridge module and modality projections, optimized with a contrastive InfoNCE objective $\mathcal{L}_{ij}$, while the base FMs remain fixed. The approach yields strong cross-modal retrieval, semantic alignment, and out-of-domain generalization, and supports retrieval-augmented multimodal generation for drug discovery prompts, achieving substantial improvements over traditional KG embeddings. This work offers a data-efficient path to multimodal biomedical AI and suggests extensions to connect pre-trained FMs across domains through knowledge graphs.
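To make the training objective above concrete, here is a minimal NumPy sketch of an additive, relation-aware transformation scored with an InfoNCE loss. It is an illustration under stated assumptions, not the authors' architecture: the linear projection `W_proj`, the relation lookup table `rel_table`, the temperature value, and all dimensions are hypothetical stand-ins for the bridge module and modality projections. The head and tail embeddings are treated as fixed inputs, mirroring the frozen base FMs.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def bridge(head, relation, W_proj, rel_table):
    """Hypothetical bridge: project the frozen head embedding into the tail
    space, then add a learned vector for the relation type (assumption)."""
    return head @ W_proj + rel_table[relation]

def info_nce(pred, candidates, positive_idx, temperature=0.1):
    """InfoNCE loss for one triple: negative log-softmax of the true tail's
    similarity among all candidate tail embeddings."""
    pred = l2_normalize(pred)
    candidates = l2_normalize(candidates)
    logits = candidates @ pred / temperature   # one score per candidate
    logits = logits - logits.max()             # shift for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[positive_idx]

# Toy example: 8-dim head space, 4-dim tail space, 3 relation types.
rng = np.random.default_rng(0)
head = rng.normal(size=8)            # stands in for a frozen FM embedding
W_proj = rng.normal(size=(8, 4))     # trainable in a real setup
rel_table = rng.normal(size=(3, 4))  # trainable in a real setup
tails = rng.normal(size=(5, 4))      # candidate tails; index 2 is the positive
loss = info_nce(bridge(head, 1, W_proj, rel_table), tails, positive_idx=2)
print(float(loss))
```

In an actual training loop only the bridge parameters would receive gradients, which is where the parameter efficiency comes from.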

Abstract

Foundation models (FMs) leverage large volumes of unlabeled data to achieve strong performance across a wide range of tasks. However, FMs developed for biomedical domains have largely remained unimodal, i.e., independently trained and used for tasks on protein sequences alone, small molecule structures alone, or clinical data alone. To overcome this limitation, we present BioBridge, a novel parameter-efficient learning framework that bridges independently trained unimodal FMs to establish multimodal behavior. BioBridge does so by using knowledge graphs (KGs) to learn transformations between one unimodal FM and another without fine-tuning any of the underlying unimodal FMs. Our empirical results show that BioBridge outperforms the best baseline KG embedding methods (on average by around 76.3%) in cross-modal retrieval tasks. We also find that BioBridge generalizes out of domain by extrapolating to unseen modalities or relations. Finally, we show that BioBridge serves as a general-purpose retriever that can aid biomedical multimodal question answering and enhance the guided generation of novel drugs.

Paper Structure

This paper contains 22 sections, 2 theorems, 2 equations, 2 figures, and 21 tables.

Key Result

Theorem 1

For any given pair of nodes $v_i, v_j \in \mathcal{V}$ of modality types $k_{v_i}, k_{v_j} \in \{1, \ldots, K\}$ and with representations given by their appropriate neural networks $s_{v_i} \in S_{k_{v_i}}, s_{v_j} \in S_{k_{v_j}}$, which are connected by relation type $r_{ij} \in \mathcal{R}$, t

Figures (2)

  • Figure 1: Conceptual comparison between our method and previous methods. Left: multimodal contrastive learning, e.g., CLIP, learns from paired data and updates all unimodal encoders. Middle: ImageBind aligns all modalities with a central modality, keeping only the central model frozen. Right: BioBridge learns the transformation across modalities from a multimodal KG, keeping all FMs frozen.
  • Figure 2: The overall workflow of BioBridge. (1) Top: we train a bridge module that transforms the head node embedding into the tail node space with contrastive learning. (2) Bottom left: the trained bridge module enables cross-modal prediction through similarity search. (3) Bottom right: the bridge module enables multimodal prompting for retrieval-augmented generation.
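
The similarity search step in the Figure 2 caption can be sketched as a cosine-similarity ranking over candidate tail embeddings. This is a minimal illustration, assuming the query has already been transformed into the tail space; the function name `top_k_retrieve` and the toy vectors are hypothetical and do not come from the paper.

```python
import numpy as np

def top_k_retrieve(query, candidates, k=3):
    """Return the indices and cosine scores of the k candidates most similar
    to the query, best first."""
    q = query / (np.linalg.norm(query) + 1e-12)
    c = candidates / (np.linalg.norm(candidates, axis=1, keepdims=True) + 1e-12)
    scores = c @ q                       # cosine similarity per candidate
    order = np.argsort(-scores)[:k]      # sort descending, keep the top k
    return order, scores[order]

# Toy example: four 2-dim candidate tail embeddings and one transformed query.
candidates = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
query = np.array([1.0, 0.1])
indices, scores = top_k_retrieve(query, candidates, k=2)
print(indices, scores)
```

For large candidate sets, an approximate nearest-neighbor index would likely replace the dense matrix product, but the ranking logic stays the same.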

Theorems & Definitions (3)

  • Theorem 1: Existence of a Bridge Module
  • Theorem 1: Existence of a Bridge Module
  • proof