A PBN-RL-XAI Framework for Discovering a "Hit-and-Run" Therapeutic Strategy in Melanoma

Zhonglin Liu

TL;DR

This work tackles innate resistance to anti-PD-1 therapy in metastatic melanoma by building dynamic Probabilistic Boolean Network models from patient transcriptomics and applying reinforcement learning to discover time-dependent intervention schedules. Explainable AI (SHAP) then mechanistically interprets the learned control policies, revealing a JUN/LOXL2 regulatory axis that rigidifies the resistant state. A non-obvious hit-and-run strategy emerges: a precisely timed 4-step LOXL2 inhibition yielding high in silico success, suggesting transient perturbations can unlock the network’s self-correcting dynamics. The framework offers a generalizable computational approach for uncovering non-intuitive, temporally precise therapeutic strategies in complex regulatory networks, with potential implications for optimizing combination immunotherapies across cancers.

Abstract

Innate resistance to anti-PD-1 immunotherapy remains a major clinical challenge in metastatic melanoma, and the underlying molecular networks are poorly understood. To address this, we constructed a dynamic Probabilistic Boolean Network model using transcriptomic data from patient tumor biopsies to elucidate the regulatory logic governing therapy response. We then employed a reinforcement learning agent to systematically discover optimal, multi-step therapeutic interventions and used explainable artificial intelligence to mechanistically interpret the agent's control policy. The analysis revealed that a precisely timed, 4-step temporary inhibition of the lysyl oxidase-like 2 protein (LOXL2) was the most effective strategy. Our explainable analysis showed that this "hit-and-run" intervention is sufficient to erase the molecular signature driving resistance, allowing the network to self-correct without requiring sustained intervention. This study presents a novel, time-dependent therapeutic hypothesis for overcoming immunotherapy resistance and provides a powerful computational framework for identifying non-obvious intervention protocols in complex biological systems.
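
To make the "hit-and-run" idea concrete, the following is a minimal sketch of a transient intervention in a probabilistic Boolean network. It is not the paper's model: the three nodes (`target`, `mediator`, `marker`), their update rules, and the selection probabilities are hypothetical, and a brute-force sweep over inhibition durations stands in for the reinforcement learning agent. The toy network has two absorbing states, all-on (read here as "resistant") and all-off, so the only question is whether pinning `target` off for a few steps is enough to tip the system from one to the other.

```python
import random

# Toy 3-node probabilistic Boolean network (hypothetical; NOT the paper's model).
# Each node maps to a list of (selection probability, update function) pairs.
PBN = {
    "target":   [(0.8, lambda s: s["marker"]),
                 (0.2, lambda s: s["target"])],
    "mediator": [(0.9, lambda s: s["target"]),
                 (0.1, lambda s: s["mediator"])],
    "marker":   [(0.9, lambda s: s["mediator"]),
                 (0.1, lambda s: s["marker"] and s["mediator"])],
}


def step(state, rng, pinned=None):
    """One synchronous update; `pinned` forces selected nodes to fixed values."""
    nxt = {}
    for node, options in PBN.items():
        weights = [p for p, _ in options]
        funcs = [f for _, f in options]
        # Draw one update function for this node according to its probabilities.
        chosen = rng.choices(funcs, weights=weights, k=1)[0]
        nxt[node] = bool(chosen(state))
    if pinned:
        nxt.update(pinned)
    return nxt


def run_trial(pin_steps, rng, horizon=200):
    """Start in the all-on state, pin `target` off for `pin_steps` steps, then release."""
    state = {"target": True, "mediator": True, "marker": True}
    for t in range(horizon):
        pinned = {"target": False} if t < pin_steps else None
        state = step(state, rng, pinned)
    # Success: the marker is off at the end of the horizon.
    return not state["marker"]


def success_rate(pin_steps, trials=500, seed=0):
    """Fraction of seeded trials that end with the marker off."""
    rng = random.Random(seed)
    return sum(run_trial(pin_steps, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for k in (0, 1, 2, 3, 4, 8):
        print(f"pin_steps={k}: success_rate={success_rate(k):.2f}")
```

In this toy setting, no intervention leaves the network in the all-on state indefinitely, while a sufficiently long but still temporary pin lets the remaining nodes switch off on their own and stay off after release. Which duration is "just long enough" depends entirely on the assumed rules and probabilities above.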

Paper Structure

This paper contains 20 sections, 5 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Shift in total gene influence from the responder (blue) to the non-responder (red) model. The plot highlights the change in influence score for each gene, sorted by their final influence in the non-responder state.
  • Figure 2: Efficacy of "hit-and-run" priming strategies. A 4-step inhibition of LOXL2 emerges as the optimal transient strategy, significantly outperforming other interventions.
  • Figure 3: Comparative SHAP analysis for the agent's decision to flip MAP2K3 versus JUN from a resistant state.
  • Figure 4: Residual SHAP importance after optimal (4-step LOXL2) versus sub-optimal (2-step JUN) priming. Effective LOXL2 priming requires less subsequent intervention.
  • Figure 5: SHAP trajectory analysis showing persistent vulnerability signatures. "Do Nothing" was the most frequent action, with negative NRAS/RELA signals that the agent learns to ignore, validating the "hit-and-run" strategy.
  • ...and 2 more figures