Accelerated Discovery of Cryoprotectant Cocktails via Multi-Objective Bayesian Optimization

Daniel Emerson; Nora Gaby-Biegel; Purva Joshi; Yoed Rabin; Rebecca D. Sandlin; Levent Burak Kara

TL;DR

The paper tackles the challenge of designing cryoprotectant agent (CPA) cocktail formulations for vitrification, which must balance high CPA concentration against preserved cell viability across a large, multi-objective design space. It introduces a data-efficient framework that combines high-throughput screening with active learning guided by multi-objective Bayesian optimization, using probabilistic surrogate models to predict performance and quantify uncertainty. The approach selects the next experiments by maximizing expected Pareto improvement under uncertainty, then updates the models with the new results. It achieves better Pareto-front quality, measured by hypervolume and inverted generational distance (IGD), while reducing experimental burden. Complementary synthetic studies show substantial sample efficiency, recovering strong Pareto-optimal sets with only a fraction of the evaluations. The framework can be adapted to different CPA libraries, objective definitions, and cell lines, enabling accelerated cryopreservation development across diverse settings.

Abstract

Designing cryoprotectant agent (CPA) cocktails for vitrification is challenging because formulations must be concentrated enough to suppress ice formation yet non-toxic enough to preserve cell viability. This tradeoff creates a large, multi-objective design space in which traditional discovery is slow, often relying on expert intuition or exhaustive experimentation. We present a data-efficient framework that accelerates CPA cocktail design by combining high-throughput screening with an active-learning loop based on multi-objective Bayesian optimization. From an initial set of measured cocktails, we train probabilistic surrogate models to predict concentration and viability and quantify uncertainty across candidate formulations. We then iteratively select the next experiments by prioritizing cocktails expected to improve the Pareto front, maximizing expected Pareto improvement under uncertainty, and update the models as new assay results are collected. Wet-lab validation shows that our approach efficiently discovers cocktails that simultaneously achieve high CPA concentrations and high post-exposure viability. Relative to a naive strategy and a strong baseline, our method improves dominated hypervolume by 9.5% and 4.5%, respectively, while reducing the number of experiments needed to reach high-quality solutions. In complementary synthetic studies, it recovers a comparably strong set of Pareto-optimal solutions using only 30% of the evaluations required by the prior state-of-the-art multi-objective approach, which amounts to saving approximately 10 weeks of experimental time. Because the framework assumes only a suitable assay and defined formulation space, it can be adapted to different CPA libraries, objective definitions, and cell lines to accelerate cryopreservation development.
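The abstract reports Pareto-front quality through dominated hypervolume, and the TL;DR also mentions inverted generational distance (IGD). As a rough illustration of what these two metrics measure, the following is a minimal sketch for two objectives that are both maximized. The function names, the toy data points, and the reference point `(0.0, 0.0)` are assumptions made for this example; they are not taken from the paper, which may compute these quantities differently.

```python
import math
from typing import List, Tuple

# Hypothetical representation: (concentration, viability), both to be maximized.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, ordered by the first objective, descending."""
    ordered = sorted(set(points), key=lambda p: (-p[0], -p[1]))
    front: List[Point] = []
    best_second = float("-inf")
    for x, y in ordered:
        # A point survives only if it beats every point with a larger-or-equal
        # first objective on the second objective.
        if y > best_second:
            front.append((x, y))
            best_second = y
    return front


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the points and bounded below by the reference point."""
    rx, ry = ref
    # Only points strictly better than the reference contribute any area.
    front = pareto_front([p for p in points if p[0] > rx and p[1] > ry])
    area = 0.0
    prev_y = ry
    for x, y in front:  # x descending, so y is ascending along the front
        area += (x - rx) * (y - prev_y)  # horizontal slab added by this point
        prev_y = y
    return area


def igd(reference_front: List[Point], obtained: List[Point]) -> float:
    """Mean distance from each reference-front point to its nearest obtained point."""
    total = 0.0
    for r in reference_front:
        total += min(math.dist(r, o) for o in obtained)
    return total / len(reference_front)


if __name__ == "__main__":
    # Toy, made-up measurements normalized to [0, 1]; not data from the paper.
    measured = [(0.2, 0.9), (0.5, 0.6), (0.8, 0.3), (0.4, 0.5), (0.1, 0.1)]
    print("Pareto front:", pareto_front(measured))
    print("Hypervolume:", hypervolume_2d(measured, (0.0, 0.0)))
    print("IGD vs. itself:", igd(pareto_front(measured), measured))
```

A larger hypervolume means the discovered cocktails dominate more of the objective space, while a smaller IGD means the discovered set lies closer to a reference Pareto front. In practice IGD requires a known or approximated reference front, which this sketch simply takes as an input.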

Paper Structure (19 sections, 6 equations, 12 figures)

Introduction
Methods
High Throughput Screening
Active Learning
Results
Experimental Bayesian Optimization
Discussion
Conclusion and Future Directions
Appendix A: Bayesian Optimization Background
Exploration-Exploitation Dilemma
Bayesian Optimization
Gaussian Process Regression
Acquisition Functions
Bayesian Optimization Example
Multi-Objective Optimization
...and 4 more sections

Figures (12)

Figure 1: Overview of the iterative CPA cocktail optimization process
Figure 2: $R^2$ plots for the first two iterations of k-center active learning, with individual points colored according to total concentration of the cocktail. (a) For the first iteration, $R^2 = 0.44$, while (b) the second iteration improved to $R^2 = 0.91$.
Figure 3: Normalized Hypervolume (a) and Inverted Generational Distance (b) versus iteration for each of the four Bayesian optimization methods on the experimental data. Each method selects a batch of $q=10$ candidate CPA cocktails to evaluate at each iteration, with the experimental results being added to the training data for subsequent iterations.
Figure 4: Visualization of the Pareto front for each of the four methods after 8 iterations of Bayesian optimization. The orange points represent the Pareto optimal points for the given method, while the gray points are in the dominated set.
Figure 5: Composition of the experimentally determined Pareto optimal CPA cocktails for each of the four methods after 8 iterations of Bayesian optimization.
...and 7 more figures
