BioCOMPASS: Integrating Biomarkers into Transformer-Based Immunotherapy Response Prediction

Sayed Hashim, Frank Soboczenski, Paul Cairns

Abstract

Datasets used in immunotherapy response prediction are typically small and heterogeneous in cancer type, drug administered, and sequencer used. Models often lose performance when tested on patient cohorts that were not included in training. Recent work has shown that transformer-based models combined with self-supervised learning generalise better than threshold-based biomarkers, but their performance is still suboptimal. We present BioCOMPASS, an extension of the transformer-based model COMPASS that integrates biomarker and treatment information to further improve generalisability. Instead of feeding biomarker data as input, we built loss components that align the biomarkers with the model's intermediate representations. We found that components such as treatment gating and the pathway consistency loss improved generalisability when evaluated with leave-one-cohort-out, leave-one-cancer-type-out and leave-one-treatment-out strategies. These results show that components exploiting biomarker and treatment information can improve the generalisability of immunotherapy response prediction. Careful curation of additional components that leverage complementary clinical information and domain knowledge is a promising direction for future research.

Paper Structure

This paper contains 24 sections, 4 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: BioCOMPASS architecture. Gene expression data is first fed into the COMPASS encoder to generate embeddings. Minimising the pathway consistency loss keeps the pathway scores predicted from the embeddings aligned with external pathway scores. The embeddings are then fed into the COMPASS concept bottleneck to generate 44 biological concepts, which are aligned with cell-type biomarker scores through the concept alignment objective. The concepts are also used to predict immunotherapy response biomarkers such as TIDE and IPRES, as well as other immune phenotypes. The treatment gating module then scales the concepts according to the specific treatment type, and the scaled concepts are passed to a classifier head to predict response. COMPASS components are shown in blue and BioCOMPASS components in green.
  • Figure 2: Points show mean performance across four random seeds with 95% CI error bars. Circles: COMPASS; squares: BioCOMPASS. BioCOMPASS shows consistent improvements in accuracy and ROC-AUC across most cohorts. Note that the subplots have different scales on the x-axis.
  • Figure 3: Points show mean performance across four random seeds with 95% CI error bars. Circles: COMPASS; squares: BioCOMPASS. Note that the subplots have different scales on the x-axis.
  • Figure 4: Points show mean performance across four random seeds with 95% CI error bars. Circles: COMPASS; squares: BioCOMPASS. Note that the subplots have different scales on the x-axis.
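
To make two of the components named in the Figure 1 caption more concrete, the following is a minimal NumPy sketch of a pathway consistency loss and a treatment gating step. It is an illustration under stated assumptions, not the authors' implementation: the linear pathway head, the mean-squared-error form of the loss, the per-treatment sigmoid gate, and all names and dimensions other than the 44 concepts (`pathway_weights`, `gate_logits`, `emb_dim`, and so on) are hypothetical choices made for this example.

```python
import numpy as np


def pathway_consistency_loss(embeddings, pathway_weights, external_scores):
    """MSE between pathway scores predicted from embeddings and external scores.

    Assumption: a single linear head and an MSE penalty; the paper may use
    a different head or distance.
    """
    predicted = embeddings @ pathway_weights
    return float(np.mean((predicted - external_scores) ** 2))


def treatment_gate(concepts, gate_logits, treatment_ids):
    """Scale each concept by a sigmoid gate looked up per treatment type.

    Assumption: one learnable logit per (treatment, concept) pair.
    """
    gates = 1.0 / (1.0 + np.exp(-gate_logits[treatment_ids]))  # values in (0, 1)
    return concepts * gates


# Toy data with hypothetical sizes; only n_concepts = 44 comes from the caption.
rng = np.random.default_rng(0)
batch, emb_dim, n_pathways, n_concepts, n_treatments = 8, 16, 5, 44, 3

embeddings = rng.normal(size=(batch, emb_dim))            # stand-in for encoder output
pathway_weights = rng.normal(size=(emb_dim, n_pathways))  # hypothetical pathway head
external_scores = rng.normal(size=(batch, n_pathways))    # stand-in for external pathway scores
concepts = rng.normal(size=(batch, n_concepts))           # stand-in for concept bottleneck output
gate_logits = rng.normal(size=(n_treatments, n_concepts))
treatment_ids = rng.integers(0, n_treatments, size=batch)

loss = pathway_consistency_loss(embeddings, pathway_weights, external_scores)
gated_concepts = treatment_gate(concepts, gate_logits, treatment_ids)

print("pathway consistency loss:", loss)
print("gated concept shape:", gated_concepts.shape)
```

In this sketch the gate can only shrink a concept's magnitude, which is one simple way to let the treatment type down-weight concepts that are less relevant to a given drug; a gate that can also amplify concepts would be an equally plausible design.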