Interplay of Fidelity and Diversity in the Evolution of the Genetic Code
Yudam Seo, Tsvi Tlusty, Junghyo Jo
TL;DR
The paper asks how the genetic code originated and why its codon-to-amino-acid mapping is so robust, treating code evolution as a multi-objective optimization that balances translation fidelity against amino acid diversity. It introduces a loss function $L = E + \eta D$. The error term $E$ captures the cost of mutations and translation errors through polar-requirement distances and mutation rates. The diversity term $D$ is a KL divergence between $f_\alpha$ and $p_\alpha$ that enforces alignment with organismal amino-acid demand, and $\eta$ sets the relative weight of the two terms. Using a calibrated codon-mutation model and simulated annealing, the authors show that the standard genetic code (SGC) sits near local optima and that the landscape contains rare, highly optimal codes across species. The results indicate coevolution under conflicting pressures of fidelity and diversity: the current code is not only error-resilient but also tuned to reflect proteome composition. The authors acknowledge limitations, including the non-evolutionary nature of simulated annealing trajectories and potential circularity in frequency estimates, and suggest experimental validation and broader evolutionary modeling as next steps.
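To make the structure of $L = E + \eta D$ concrete, here is a minimal Python sketch on a toy four-codon code. It is not the authors' implementation. The exact form of $E$, the direction of the KL divergence, the reading of $f_\alpha$ as the fraction of codons assigned to amino acid $\alpha$, and all names (`code_loss`, `hamming_one`, the toy values) are assumptions made for illustration.

```python
import math

def kl_divergence(f, p):
    """KL(f || p) for discrete distributions given as dicts; terms with f = 0 contribute 0."""
    return sum(f[a] * math.log(f[a] / p[a]) for a in f if f[a] > 0)

def code_loss(code, mutation_rate, polar, demand, eta):
    """Toy L = E + eta * D for a codon -> amino acid mapping (assumed form).

    Assumes every amino acid used in `code` appears in `demand` with p > 0.
    """
    codons = list(code)
    # E (assumed form): squared difference in a polar-requirement-like value
    # between the amino acids of two codons, weighted by a mutation rate
    # and averaged over codons.
    error = 0.0
    for c1 in codons:
        for c2 in codons:
            if c1 != c2:
                d = polar[code[c1]] - polar[code[c2]]
                error += mutation_rate(c1, c2) * d * d
    error /= len(codons)
    # f (assumed meaning): fraction of codons assigned to each amino acid.
    counts = {}
    for aa in code.values():
        counts[aa] = counts.get(aa, 0) + 1
    f = {a: counts.get(a, 0) / len(codons) for a in demand}
    # D: KL divergence between codon share f and demand p (direction assumed).
    return error + eta * kl_divergence(f, demand)

def hamming_one(c1, c2):
    """Hypothetical mutation model: rate 1 for single-letter changes, else 0."""
    return 1.0 if sum(x != y for x, y in zip(c1, c2)) == 1 else 0.0

toy_code = {"AA": "X", "AG": "X", "GA": "Y", "GG": "Y"}
toy_polar = {"X": 0.0, "Y": 1.0}
balanced = code_loss(toy_code, hamming_one, toy_polar, {"X": 0.5, "Y": 0.5}, eta=2.0)
skewed = code_loss(toy_code, hamming_one, toy_polar, {"X": 0.25, "Y": 0.75}, eta=2.0)
```

In this toy, `balanced` contains only the error term because the codon shares match the demand exactly, while `skewed` adds a positive diversity penalty for the same code. Raising `eta` makes that mismatch count for more relative to the error load.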
Abstract
The origin and organizing principles of the genetic code remain fundamental puzzles in life science. The vanishingly low probability of the natural codon-to-amino acid mapping arising by chance has spurred the hypothesis that its structure is a solution optimized for robustness against mutations and translational errors. For the construction of effective molecular machines, the dictionary of encoded amino acids must also be diverse enough in physicochemical features. Here, we examine whether the standard genetic code can be understood as a near-optimal solution balancing these two objectives: minimizing error load and aligning codon assignments with the naturally occurring amino acid composition. Using simulated annealing, we explore this trade-off across a broad range of parameters. We find that the standard genetic code lies near local optima within the multidimensional parameter space. It is a highly effective solution that balances fidelity against resource availability constraints. These results suggest that the present genetic code reflects coevolution under conflicting pressures of fidelity and diversity, offering new insight into its emergence and evolution.
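The simulated annealing search mentioned in the abstract can be illustrated with a generic sketch. The version below runs on a toy eight-codon, two-amino-acid code and is not the authors' procedure. The geometric cooling schedule, the swap move, the stand-in loss (a count of neighboring codons that encode different amino acids), and every name are assumptions chosen for illustration.

```python
import math
import random
from itertools import product

def anneal(state, loss, propose, steps=2000, t_start=1.0, t_end=1e-3, seed=0):
    """Generic simulated annealing; returns the best state seen and its loss."""
    rng = random.Random(seed)
    current, current_loss = state, loss(state)
    best, best_loss = current, current_loss
    for k in range(steps):
        # Geometric cooling from t_start down to t_end (assumed schedule).
        t = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        candidate = propose(current, rng)
        cand_loss = loss(candidate)
        delta = cand_loss - current_loss
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_loss = candidate, cand_loss
            if current_loss < best_loss:
                best, best_loss = current, current_loss
    return best, best_loss

def swap_two_codons(code, rng):
    """Hypothetical move: exchange the amino acids assigned to two random codons."""
    a, b = rng.sample(sorted(code), 2)
    new = dict(code)
    new[a], new[b] = new[b], new[a]
    return new

def mismatched_neighbors(code):
    """Stand-in loss: codon pairs one letter apart that encode different amino acids."""
    codons = sorted(code)
    total = 0
    for i, c1 in enumerate(codons):
        for c2 in codons[i + 1:]:
            if sum(x != y for x, y in zip(c1, c2)) == 1 and code[c1] != code[c2]:
                total += 1
    return total

codons = ["".join(p) for p in product("AG", repeat=3)]
# Deliberately poor start: every single-letter change alters the amino acid.
start = {c: "XY"[c.count("G") % 2] for c in codons}
best, best_loss = anneal(start, mismatched_neighbors, swap_two_codons)
```

Because the swap move only permutes assignments, the number of codons per amino acid stays fixed throughout the search. A loss that also rewards matching amino acid demand would need moves that change those counts.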
