Deep Reinforcement Learning for Controlled Traversing of the Attractor Landscape of Boolean Models in the Context of Cellular Reprogramming

Andrzej Mizera, Jakub Zarzycki

TL;DR

This work addresses the challenge of designing scalable interventions for cellular reprogramming by framing it as source-target attractor control in Boolean network (BN) and probabilistic Boolean network (PBN) models under the asynchronous update mode. It introduces pbn-STAC, a framework based on deep reinforcement learning (DRL) that uses a pseudo-attractor identification procedure (PASIP) to detect frequently revisited states during training, an exploration probability boost (EPB) to stabilize learning, and Branching Dueling Q-Networks (BDQ) to handle multi-gene perturbations. The resulting control strategies are competitive with optimal solutions wherever ground truth is available, and the approach remains robust on realistic gene regulatory network (GRN) models of melanoma and IRBB-33. The work thus offers a scalable pathway toward guiding wet-lab cellular reprogramming experiments through in silico predictions.

Abstract

Cellular reprogramming can be used for both the prevention and cure of different diseases. However, the efficiency of discovering reprogramming strategies with classical wet-lab experiments is hindered by lengthy time commitments and high costs. In this study, we develop a novel computational framework based on deep reinforcement learning that facilitates the identification of reprogramming strategies. To this end, we formulate a control problem in the context of cellular reprogramming for the frameworks of Boolean networks (BNs) and probabilistic Boolean networks (PBNs) under the asynchronous update mode. Furthermore, we introduce the notion of a pseudo-attractor and a procedure for identifying pseudo-attractor states during training. Finally, we devise a computational framework for solving the control problem, which we test on a number of different models.

Paper Structure

This paper contains 29 sections, 6 theorems, 21 equations, 9 figures, and 3 tables.

Key Result

Theorem 1

Let $A$ be an attractor of a PBN. Then there exists a pseudo-attractor $PA \subseteq A$ such that $|PA| \geq 1$.
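
A minimal Python sketch of why Theorem 1 is plausible, under stated assumptions. The definition of a pseudo-attractor is not reproduced on this page, so the sketch assumes a frequency-based reading: a state of an attractor $A$ belongs to $PA$ if it is visited at least $1/|A|$ of the time. The 2-gene PBN, all function names, and the simulation lengths are hypothetical and not taken from the paper. Because the visit frequencies over $A$ sum to 1, at least one state must reach the $1/|A|$ threshold, so the returned set cannot be empty.

```python
import random
from collections import Counter

# Hypothetical 2-gene PBN (illustrative only, not a model from the paper).
# Each gene has a list of (Boolean function, selection probability) pairs.
PBN = [
    [(lambda s: 1, 0.9), (lambda s: 0, 0.1)],  # gene 0: mostly switched on
    [(lambda s: s[0], 1.0)],                    # gene 1: copies gene 0
]


def async_step(state, rng):
    """One asynchronous update: pick one gene, then one of its functions."""
    gene = rng.randrange(len(PBN))
    functions, weights = zip(*PBN[gene])
    f = rng.choices(functions, weights=weights)[0]
    nxt = list(state)
    nxt[gene] = int(f(state))
    return tuple(nxt)


def visit_frequencies(start, steps, rng, burn_in=1000):
    """Empirical visit frequency of each state after a burn-in phase."""
    counts = Counter()
    state = start
    for t in range(burn_in + steps):
        state = async_step(state, rng)
        if t >= burn_in:
            counts[state] += 1
    return {s: c / steps for s, c in counts.items()}


def pseudo_attractor(freqs):
    """Assumed criterion: keep states visited at least 1/n of the time,
    where n is the number of distinct states observed after burn-in."""
    threshold = 1.0 / len(freqs)
    return {s for s, p in freqs.items() if p >= threshold}


rng = random.Random(0)  # fixed seed for reproducibility
freqs = visit_frequencies((0, 0), 50_000, rng)
pa = pseudo_attractor(freqs)
print(sorted(freqs.items()))
print(pa)
```

In this toy network the whole state space forms a single attractor, and gene 0 is switched on with probability 0.9, so the simulation spends most of its time in state (1, 1). The sketch treats the set of states observed after burn-in as the attractor, which is only reasonable when the simulation is long enough to visit every attractor state.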

Figures (9)

  • Figure 1: Average episode lengths during DRL agent training, without and with EPB. Newly found pseudo-attractor states cause abrupt increases in episode length.
  • Figure 2: Heatmaps of strategy lengths for individual source-target (pseudo-)attractor states averaged over 10 runs.
  • Figure 3: Training of the DRL agent on the IRBB-33 environment with different reward schemes.
  • Figure 4: State transition graph (STG) of the PBN defined in the running example under the asynchronous update mode. Shaded states are the attractor states of the three attractors, i.e., two fixed-point attractors $A_1=\{(0,0,0,0)\}$ and $A_2=\{(0,1,0,1)\}$, and one multi-state attractor $A_3 = \{(1,0,0,0), (1,0,1,0)\}$.
  • Figure 5: Schematic representation of the BDQ architecture of the DRL agent in pbn-STAC.
  • ...and 4 more figures
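
The caption of Figure 4 describes attractors of a state transition graph under the asynchronous update mode. As a rough illustration of that notion, the Python sketch below enumerates the attractors of a small deterministic BN as the terminal strongly connected components of its asynchronous STG. The 3-gene network and all function names are hypothetical; this is not the PBN of the running example, whose update functions are not given on this page. The brute-force reachability check is only practical for very small networks.

```python
from itertools import product

# Hypothetical 3-gene Boolean network (illustrative only).
FUNCTIONS = [
    lambda s: s[0],               # gene 0 keeps its value
    lambda s: s[0] and not s[1],  # gene 1 oscillates while gene 0 is on
    lambda s: s[1] or s[2],       # gene 2 latches on once gene 1 is on
]


def successors(state):
    """Asynchronous successors: update exactly one gene whose value changes."""
    result = set()
    for gene, f in enumerate(FUNCTIONS):
        value = int(f(state))
        if value != state[gene]:
            result.add(state[:gene] + (value,) + state[gene + 1:])
    return result


def reachable(state):
    """All states reachable from `state`, including itself."""
    seen, stack = {state}, [state]
    while stack:
        for nxt in successors(stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def attractors():
    """A state lies in an attractor iff every state it can reach can reach
    it back; its reachable set is then the whole attractor."""
    states = product((0, 1), repeat=len(FUNCTIONS))
    reach = {s: reachable(s) for s in states}
    found = set()
    for s, r in reach.items():
        if all(s in reach[t] for t in r):
            found.add(frozenset(r))
    return found


for attractor in sorted(attractors(), key=sorted):
    print(sorted(attractor))
```

For this toy network the procedure finds two fixed-point attractors, {(0,0,0)} and {(0,0,1)}, and one two-state attractor, {(1,0,1), (1,1,1)}, which mirrors the mix of fixed-point and multi-state attractors described in the caption.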

Theorems & Definitions (21)

  • Definition 1: Pseudo-attractor
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Definition 2: Attractor-based Control Strategy
  • Definition 3: Source-Target Attractor Control
  • Theorem 3
  • proof
  • Definition 4: Boolean network
  • ...and 11 more