MEXA-CTP: Mode Experts Cross-Attention for Clinical Trial Outcome Prediction

Yiqing Zhang, Xiaozhong Liu, Fabricio Murai

TL;DR

Clinical trial outcome prediction benefits from integrating heterogeneous inputs, but prior approaches rely on expensive wet-lab data and on human-designed interaction structures. MEXA-CTP is a lightweight, attention-based framework whose "mode experts" fuse drug, disease, and eligibility-criteria information through stage-wise encoding, knowledge embedding, and cross-modal interactions, optimized with a Cauchy loss and contrastive learning. On the TOP benchmark, MEXA-CTP improves over the previous state of the art (HINT) by up to 11.3% in F1, 12.2% in PR-AUC, and 2.5% in ROC-AUC, and ablations confirm the contribution of statement-level encoding, positional information, and the joint loss. Because it needs no costly wet-lab data, the approach offers a practical way to prioritize promising trials early and could reduce development costs.

Abstract

Clinical trials are the gold standard for assessing the effectiveness and safety of drugs for treating diseases. Given the vast design space of drug molecules, the elevated financial cost, and the multi-year timeline of these trials, research on clinical trial outcome prediction has gained immense traction. Accurate predictions must leverage data of diverse modes, such as drug molecules, target diseases, and eligibility criteria, to infer successes and failures. Previous deep learning approaches for this task, such as HINT, often require wet-lab data from synthesized molecules and/or rely on prior knowledge to encode interactions as part of the model architecture. To address these limitations, we propose a lightweight attention-based model, MEXA-CTP, that integrates readily available multi-modal data and generates effective representations via specialized modules dubbed "mode experts", while avoiding human biases in model design. We optimize MEXA-CTP with the Cauchy loss to capture relevant interactions across modes. Our experiments on the Trial Outcome Prediction (TOP) benchmark demonstrate that MEXA-CTP improves upon existing approaches, with gains over HINT of up to 11.3% in F1 score, 12.2% in PR-AUC, and 2.5% in ROC-AUC. Ablation studies quantify the effectiveness of each component of the proposed method.
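
The abstract names the Cauchy loss but this section does not give its formula. As a rough illustration only, the sketch below uses the common robust-statistics form, log(1 + (r / c)^2) averaged over samples, where r is the residual. The function name `cauchy_loss`, the `scale` parameter c, and the way residuals are formed are assumptions for illustration and may differ from the paper's actual formulation.

```python
import math


def cauchy_loss(pred, target, scale=1.0):
    """Mean Cauchy (Lorentzian) loss over paired predictions and targets.

    Assumed form: log(1 + ((pred - target) / scale) ** 2).
    This is a generic definition, not necessarily the paper's exact loss.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equal length")
    if scale <= 0:
        raise ValueError("scale must be positive")
    total = 0.0
    for p, t in zip(pred, target):
        r = (p - t) / scale          # scaled residual
        total += math.log1p(r * r)   # log(1 + r^2), grows slowly for large r
    return total / len(pred)


if __name__ == "__main__":
    # A large residual is penalized far less than under squared error.
    print(cauchy_loss([0.9, 0.2], [1.0, 0.0]))
    print(cauchy_loss([10.0], [0.0]))
```

The logarithm makes the penalty grow slowly for large residuals, which is why this family of losses is usually described as robust to outliers.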

Paper Structure

This paper contains 29 sections, 17 equations, 3 figures, and 3 tables.

Figures (3)

  • Figure 1: (Left) Recent work (HINT; Fu et al., 2022) requires wet-lab data acquisition for pre-training drug encoders and uses human-designed connections to capture cross-domain interactions. (Right) MEXA-CTP generates domain-specific rich embeddings, filters out irrelevant information, and extracts cross-domain interactions via mode experts.
  • Figure 2: (Left) The MEXA-CTP model comprises four stages (from bottom to top): (i) Encoding Modules generate initial embeddings for each mode, (ii) Knowledge Embedding Models further enrich the information, (iii) Mode Experts capture relationships between different modes, and (iv) the Knowledge Compensation Module enhances the interactions. (Top Right) Example of how two mode experts interact to fuse information from the molecule and disease domains; $u$, $s$ and $t$ indicate single tokens from the token set. (Bottom Right) We use a residual network for final outcome prediction. Further details are provided in the Method section.
  • Figure 3: Token selection frequency per index, grouped by number of valid tokens $x$. Color indicates the ratio between selection frequency and number of samples. For exclusion (resp. inclusion) criteria, columns with $x < 4$ (resp. $x < 5$) are excluded due to the small number of samples. Columns with $x = 1$ are omitted since the single token is always selected.
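
The Figure 2 caption describes mode experts fusing information across the molecule and disease domains. As a minimal sketch of the general idea, the code below implements plain scaled dot-product cross-attention, in which tokens from one mode (queries) attend over tokens from another mode (keys and values). It omits learned projections, token selection, and everything else specific to MEXA-CTP; the names `cross_attention`, `drug_tokens`, and `disease_tokens` are hypothetical and not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: tokens of one mode, each a list of d floats
    keys, values: tokens of the other mode (same count; keys have d floats)
    Returns one fused vector per query token.
    """
    d = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query token to every token of the other mode.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted average of the other mode's value vectors.
        fused.append([sum(w * v[c] for w, v in zip(weights, values))
                      for c in range(len(values[0]))])
    return fused


if __name__ == "__main__":
    drug_tokens = [[1.0, 0.0]]                   # hypothetical embeddings
    disease_tokens = [[1.0, 0.0], [0.0, 1.0]]    # hypothetical embeddings
    print(cross_attention(drug_tokens, disease_tokens, disease_tokens))
```

In this sketch the drug token places more weight on the disease token it is most similar to, which is the basic mechanism by which attention lets one mode pull in relevant information from another.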

Theorems & Definitions (5)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5