Regulatory DNA Sequence Design with Reinforcement Learning

Zhao Yang, Bing Su, Chuan Cao, Ji-Rong Wen

TL;DR

This work tackles CRE design by framing it as an RL-driven optimization of a pre-trained autoregressive DNA model, enabling generation of novel high-fitness sequences while preserving diversity. It integrates domain knowledge by inferring activator and repressor TFBS roles through a SHAP-guided analysis of TFBS frequency features and injecting TFBS-based rewards into the RL objective. The approach, named TACO, demonstrates superior performance across yeast promoters and human enhancers, including in offline MBO settings, by balancing data-driven guidance with biological priors. This TFBS-aware design framework offers a practical path to more efficient CRE design for therapeutic and biotechnological applications, with broad implications for sequence-level design in regulatory genomics.

Abstract

Cis-regulatory elements (CREs), such as promoters and enhancers, are relatively short DNA sequences that directly regulate gene expression. The fitness of CREs, measured by their ability to modulate gene expression, depends strongly on their nucleotide sequences, especially on specific motifs known as transcription factor binding sites (TFBSs). Designing high-fitness CREs is crucial for therapeutic and bioengineering applications. Current CRE design methods have two major drawbacks: (1) they typically rely on iterative optimization strategies that modify existing sequences and are prone to local optima, and (2) they lack the guidance of biological prior knowledge in sequence optimization. In this paper, we address these limitations by proposing a generative approach that leverages reinforcement learning (RL) to fine-tune a pre-trained autoregressive (AR) model. Our method incorporates data-driven biological priors by deriving computational inference-based rewards that simulate the addition of activator TFBSs and removal of repressor TFBSs, which are then integrated into the RL process. We evaluate our method on promoter design tasks in two yeast media conditions and enhancer design tasks for three human cell types, demonstrating its ability to generate high-fitness CREs while maintaining sequence diversity. The code is available at https://github.com/yangzhao1230/TACO.
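
The reward structure described in the abstract (a bonus for activator TFBSs, a penalty for repressor TFBSs, and an oracle fitness score for the finished sequence) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `stepwise_rewards`, the weight `alpha`, the exact-string motifs, and `toy_oracle` are all hypothetical stand-ins chosen for illustration. Real TFBSs are usually matched with frequency matrices rather than exact strings, and the real oracle is a trained reward model.

```python
from typing import Callable, Dict, List


def toy_oracle(sequence: str) -> float:
    """Hypothetical stand-in for a learned fitness model: counts 'C' bases."""
    return float(sequence.count("C"))


def stepwise_rewards(
    sequence: str,
    tfbs_signs: Dict[str, float],
    oracle: Callable[[str], float],
    alpha: float = 0.01,
) -> List[float]:
    """Return one reward per generated nucleotide.

    tfbs_signs maps a motif string to +1.0 (assumed activator) or
    -1.0 (assumed repressor). A step is rewarded when the nucleotide
    appended at that step completes a motif. The oracle score is
    added to the final step only.
    """
    rewards = [0.0] * len(sequence)
    for end in range(1, len(sequence) + 1):
        prefix = sequence[:end]
        for motif, sign in tfbs_signs.items():
            # The action at position end - 1 just completed this motif.
            if prefix.endswith(motif):
                rewards[end - 1] += alpha * sign
    if rewards:
        rewards[-1] += oracle(sequence)
    return rewards


# Hypothetical motifs: "GATA" treated as an activator, "TTCC" as a repressor.
motifs = {"GATA": 1.0, "TTCC": -1.0}
print(stepwise_rewards("AGATAC", motifs, toy_oracle, alpha=0.5))
```

In this sketch, `alpha` controls how strongly the TFBS prior is weighted against the oracle score, which is one simple way to balance biological priors against data-driven guidance.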

Paper Structure

This paper contains 30 sections, 9 equations, 9 figures, 19 tables, 1 algorithm.

Figures (9)

  • Figure 1: (A) TFBSs are commonly represented as frequency matrices, indicating the frequency of each nucleotide appearing at specific positions within the binding site. (B) GATA2 and HNF1B specifically activate gene expression in blood cells and liver cells, respectively, while REST specifically represses gene expression in neural cells.
  • Figure 2: AR generation of a DNA sequence. The action $a_i$ represents the nucleotide to be appended to the sequence, and the state $s_{i-1}$ is the concatenation of all actions taken up to time $i-1$. A negative reward is given if an action generates a repressive TFBS, while a positive reward is given for an activating TFBS. The final sequence is evaluated using a reward model (oracle) to obtain its fitness. BOS represents the beginning of the sequence, and ATCG denotes the nucleotide bases.
  • Figure 3: A black-box LightGBM model takes TFBS occurrences as input, and SHAP values infer their contributions to CRE fitness prediction.
  • Figure 4: Evaluation metrics by optimization round for TACO, BO, PEX and Adalead. Shaded regions indicate the standard deviation of 5 runs. The x-axis indicates the number of rounds.
  • Figure 5: Venn diagram categorizing TFBSs by their functional roles (positive, neutral, or negative) in yeast promoters across two media conditions (Complex and Defined). The substantial overlap across all categories indicates that TFBSs maintain consistent regulatory functions regardless of media composition, confirming the absence of condition-specific TFBS activity within the same cell type.
  • ...and 4 more figures
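
Figure 1(A) notes that TFBSs are commonly represented as frequency matrices. As a minimal sketch of that representation, the snippet below builds a position frequency matrix from a set of aligned binding-site sequences. The function name `position_frequency_matrix` and the example sites are hypothetical and are not taken from the paper or its code.

```python
from typing import Dict, List


def position_frequency_matrix(sites: List[str]) -> List[Dict[str, float]]:
    """Compute per-position nucleotide frequencies for aligned sites.

    Each entry of the returned list corresponds to one position in the
    binding site and maps each base in "ACGT" to its relative frequency.
    """
    if not sites:
        raise ValueError("at least one site is required")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")

    matrix = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        matrix.append(
            {base: column.count(base) / len(sites) for base in "ACGT"}
        )
    return matrix


# Hypothetical aligned binding sites, made up for illustration.
example_sites = ["AGATAA", "TGATAA", "AGATAG", "AGATAA"]
pfm = position_frequency_matrix(example_sites)
for position, frequencies in enumerate(pfm):
    print(position, frequencies)
```

Each row of the result sums to 1 when the sites contain only the bases A, C, G, and T, so a row can be read as the distribution over nucleotides at that position of the binding site.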