Ctrl-DNA: Controllable Cell-Type-Specific Regulatory DNA Design via Constrained RL

Xingyu Chen, Shihao Ma, Runsheng Lin, Jiecong Lin, Bo Wang

TL;DR

This work addresses the design of cis-regulatory elements (CREs) with high expression in a target cell type and controlled off-target activity. It formulates CRE design as a constrained Markov decision process and solves it with constrained reinforcement learning using Lagrangian multipliers: the policy maximizes the target-cell objective $J_0(\theta)$ subject to $J_i(\theta)\le \delta_i$ for each off-target cell type $i$. Policy gradients are computed from batch-normalized rewards, which avoids training separate value models. A TFBS-based regularization term, $R_{TFBS}(X)=\text{Corr}(q_{gen},q_{real})$, aligns generated motifs with biologically realistic distributions. Empirical results on human enhancer and promoter massively parallel reporter assay (MPRA) datasets across six cell types show that Ctrl-DNA achieves higher target activity and better constraint satisfaction than baselines while preserving motif plausibility and diversity. This constraint-aware RL approach provides a scalable pathway for cell-type-specific CRE design, with potential impact in gene therapy and synthetic biology.
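The constrained objective above can be illustrated with a minimal sketch. It assumes one scalar reward per generated sequence for the target cell type and for each off-target cell type, and it uses a generic Lagrangian scheme (advantage = normalized target reward minus multiplier-weighted normalized off-target rewards, with projected gradient ascent on the multipliers). The function names, the learning rate `lr`, and the exact normalization are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev


def batch_normalize(values, eps=1e-8):
    """Standardize a batch of rewards to zero mean and (roughly) unit variance.

    Using the batch itself as a baseline is what removes the need for a
    separate learned value model in this sketch.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]


def lagrangian_advantages(target_rewards, off_target_rewards, multipliers):
    """Combine target and off-target rewards into one advantage per sequence.

    target_rewards:     list of floats, one per generated sequence
    off_target_rewards: list of lists, one inner list per off-target cell type
    multipliers:        list of non-negative lambda_i (hypothetical names)
    """
    adv = batch_normalize(target_rewards)
    for lam, rewards in zip(multipliers, off_target_rewards):
        penalty = batch_normalize(rewards)
        adv = [a - lam * p for a, p in zip(adv, penalty)]
    return adv


def update_multipliers(multipliers, off_target_rewards, thresholds, lr=0.1):
    """Projected gradient ascent on each lambda_i.

    lambda_i grows when the mean off-target reward exceeds delta_i and
    shrinks (never below zero) when the constraint is satisfied.
    """
    return [
        max(0.0, lam + lr * (mean(rewards) - delta))
        for lam, rewards, delta in zip(multipliers, off_target_rewards, thresholds)
    ]
```

In a training loop of this shape, the advantages would weight the log-probabilities of the sampled sequences in the policy-gradient step, and the multipliers would be updated once per batch.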

Abstract

Designing regulatory DNA sequences that achieve precise cell-type-specific gene expression is crucial for advancements in synthetic biology, gene therapy and precision medicine. Although transformer-based language models (LMs) can effectively capture patterns in regulatory DNA, their generative approaches often struggle to produce novel sequences with reliable cell-specific activity. Here, we introduce Ctrl-DNA, a novel constrained reinforcement learning (RL) framework tailored for designing regulatory DNA sequences with controllable cell-type specificity. By formulating regulatory sequence design as a biologically informed constrained optimization problem, we apply RL to autoregressive genomic LMs, enabling the models to iteratively refine sequences that maximize regulatory activity in targeted cell types while constraining off-target effects. Our evaluation on human promoters and enhancers demonstrates that Ctrl-DNA consistently outperforms existing generative and RL-based approaches, generating high-fitness regulatory sequences and achieving state-of-the-art cell-type specificity. Moreover, Ctrl-DNA-generated sequences capture key cell-type-specific transcription factor binding sites (TFBS), short DNA motifs recognized by regulatory proteins that control gene expression, demonstrating the biological plausibility of the generated sequences.
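The TFBS regularizer $R_{TFBS}(X)=\text{Corr}(q_{gen},q_{real})$ from the TL;DR can be sketched in the same spirit. The example below assumes that $q_{gen}$ and $q_{real}$ are per-motif occurrence frequencies, that motifs are matched as exact substrings, and that the correlation is Pearson's; all three are simplifying assumptions made for illustration (real pipelines typically scan with position weight matrices), and the function names are hypothetical.

```python
from math import sqrt


def motif_frequencies(sequences, motifs):
    """Fraction of sequences that contain each motif (exact substring match)."""
    n = len(sequences)
    return [sum(motif in seq for seq in sequences) / n for motif in motifs]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors; 0.0 if either is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def tfbs_reward(generated, real, motifs):
    """Correlation between motif frequencies in generated and real sequences."""
    q_gen = motif_frequencies(generated, motifs)
    q_real = motif_frequencies(real, motifs)
    return pearson(q_gen, q_real)
```

A reward of this form is high when generated sequences use motifs in roughly the same proportions as real sequences, and it falls toward zero or below when the motif profile drifts.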

Paper Structure

This paper contains 20 sections, 14 equations, 4 figures, 7 tables, 1 algorithm.

Figures (4)

  • Figure 1: Overview of the Ctrl-DNA framework for controllable regulatory sequence generation. Ctrl-DNA builds on a pre-trained autoregressive DNA language model and applies constrained reinforcement learning to guide sequence generation toward high fitness in a target cell type (e.g., HepG2) while suppressing off-target fitness (e.g., K562, SK-N-SH), enabling the generation of CREs with strong cell-type specificity.
  • Figure 2: Pairwise fitness comparison of generated CREs highlights Ctrl-DNA’s cell-type specificity. Each subplot compares mean $\pm$ s.d. fitness in two human cell lines (y = target, x = off-target); points in the top-right denote sequences with high on-target and low off-target fitness. Baseline methods are shown in pastel colors, while Ctrl-DNA variants ($\delta$ = 0.3/0.4, 0.5, 0.6) are connected by red dotted lines, illustrating the trade-off as constraint strength increases and Ctrl-DNA’s dominance in the top-right corner for both the enhancer (a) and promoter (b) datasets.
  • Figure 3: (a) Fraction of Ctrl-DNA-generated enhancers containing selected cell type-specific transcription factor (TF) motifs. (b) Diversity scores of generated sequences for HepG2 enhancers (left) and K562 promoters (right) across different methods.
  • Figure 4: Sequence diversity scores for generated sequences on the human enhancer and promoter datasets. Higher values indicate greater variability among generated sequences.