Joint Probability Distribution of mRNA and Protein Molecules in a Stochastic Gene Expression Model
Yuntao Lu, Yunxin Zhang
TL;DR
This work addresses the challenge of analyzing stochastic gene expression models with multiple gene states analytically. It derives a hierarchy of ODEs for the two-dimensional binomial moments $\mathcal{B}_{p,q}$ from the chemical master equation via a generating-function formulation. The authors show that, in steady state, these binomial moments satisfy a linear system from which the joint distribution $\mathbb{P}(m,n)$ of mRNA and protein can be reconstructed using $\mathbb{P}(m,n)=\sum_{p=m}^\infty\sum_{q=n}^\infty (-1)^{p+q+m+n}\binom{p}{m}\binom{q}{n}\mathcal{B}_{p,q}$. They provide explicit low-order moment expressions, establish a stable hierarchical algorithm to compute moments to arbitrary order, and compare the exact results with burst-approximation models, showing that the mean is preserved while the variance may differ, with a quantitative bound on the discrepancy. The method enables rigorous, non-burst analyses of complete gene-expression models and offers a practical route to quantify when burst approximations fail, with potential extensions to multiscale modeling and Markov state model (MSM) construction from molecular dynamics data.
Abstract
Stochastic modeling of gene expression is a classic problem in theoretical biophysics. However, models formulated via the chemical master equation have long been considered analytically intractable unless a burst approximation is applied. This article shows that general stochastic gene expression models with an arbitrary number of gene states admit direct analysis. Based on the chemical master equation and the high-dimensional binomial moment method, we derive recurrence relations for the binomial moments in steady state, yielding analytical expressions to arbitrary order in a hierarchical manner. Subsequently, the joint probability mass function of the mRNA and protein copy numbers can be reconstructed. An algorithm is developed for numerical computation. In particular, explicit expressions for low-order cumulants are presented. Compared with models under the burst approximation, the mean remains exact, whereas the variance typically differs. We estimate the difference between the two second-order binomial moments using functional analysis, thereby evaluating the validity of the burst approximation.
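As an illustration of the reconstruction step, the following minimal Python sketch applies the truncated inversion $\mathbb{P}(m,n)\approx\sum_{p=m}^{K}\sum_{q=n}^{K}(-1)^{(p-m)+(q-n)}\binom{p}{m}\binom{q}{n}\mathcal{B}_{p,q}$ to a toy input. It is not the algorithm developed in the paper. It assumes the standard definition $\mathcal{B}_{p,q}=\mathbb{E}\big[\binom{m}{p}\binom{n}{q}\big]$, and it uses two independent Poisson variables (for which $\mathcal{B}_{p,q}=\lambda_m^p\lambda_n^q/(p!\,q!)$) purely as a test case with a known answer. The names `reconstruct_joint_pmf`, `LAM_M`, `LAM_N`, and `ORDER` are hypothetical.

```python
from math import comb, exp, factorial


def reconstruct_joint_pmf(B, m, n):
    """Truncated inversion of binomial moments.

    B[p][q] holds the binomial moment of order (p, q); the infinite
    sums are cut off at the largest order stored in B.
    """
    total = 0.0
    for p in range(m, len(B)):
        for q in range(n, len(B[p])):
            # (-1)^(p+q+m+n) has the same parity as (-1)^((p-m)+(q-n)).
            sign = -1.0 if (p - m + q - n) % 2 else 1.0
            total += sign * comb(p, m) * comb(q, n) * B[p][q]
    return total


def poisson_pmf(lam, k):
    """Poisson probability mass at k, used only to check the toy case."""
    return exp(-lam) * lam**k / factorial(k)


# Toy input (an assumption, not a gene-expression model): independent
# Poisson counts, whose binomial moments are known in closed form.
LAM_M, LAM_N, ORDER = 1.5, 2.0, 40
B = [
    [LAM_M**p / factorial(p) * LAM_N**q / factorial(q) for q in range(ORDER + 1)]
    for p in range(ORDER + 1)
]

for m, n in [(0, 0), (1, 2), (3, 1)]:
    approx = reconstruct_joint_pmf(B, m, n)
    exact = poisson_pmf(LAM_M, m) * poisson_pmf(LAM_N, n)
    print(f"P({m},{n}): reconstructed={approx:.12f}, exact={exact:.12f}")
```

Because the series alternates in sign, a floating-point evaluation like this one can lose accuracy when the moments grow large (for example, at high mean copy numbers); exact rational arithmetic or a higher truncation order may then be needed.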
