Table of Contents
Fetching ...
Paper

Masked Omics Modeling for Multimodal Representation Learning across Histopathology and Molecular Profiles

Abstract

Self-supervised learning (SSL) has driven major advances in computational pathology by enabling the learning of rich representations from histopathology data. Yet, tissue analysis alone may fall short in capturing broader molecular complexity, as key complementary information resides in high-dimensional omics profiles such as transcriptomics, methylomics, and genomics. To address this gap, we introduce MORPHEUS, the first multimodal pre-training strategy that integrates histopathology images and multi-omics data within a shared transformer-based architecture. At its core, MORPHEUS relies on a novel masked omics modeling objective that encourages the model to learn meaningful cross-modal relationships. This yields a general-purpose pre-trained encoder that can be applied to histopathology alone or in combination with any subset of omics modalities. Beyond inference, MORPHEUS also supports flexible any-to-any omics reconstruction, enabling one or more omics profiles to be reconstructed from any modality subset that includes histopathology. Pre-trained on a large pan-cancer cohort, MORPHEUS shows substantial improvements over supervised and SSL baselines across diverse tasks and modality combinations. Together, these capabilities position it as a promising direction for the development of multimodal foundation models in oncology. Code is publicly available at https://github.com/Lucas-rbnt/MORPHEUS