Conformal Prediction in Dynamic Biological Systems

Alberto Portela, Julio R. Banga, Marcos Matabuena

TL;DR

Conformal inference methods are proposed as a general framework for quantifying uncertainty in dynamic models of biological systems. Two novel algorithms are introduced that, in some instances, offer non-asymptotic guarantees, enhancing robustness and scalability across various applications.

Abstract

Uncertainty quantification (UQ) is the process of systematically determining and characterizing the degree of confidence in computational model predictions. In the context of systems biology, especially with dynamic models, UQ is crucial because it addresses the challenges posed by nonlinearity and parameter sensitivity, allowing us to properly understand and extrapolate the behavior of complex biological systems. Here, we focus on dynamic models represented by deterministic nonlinear ordinary differential equations. Many current UQ approaches in this field rely on Bayesian statistical methods. While powerful, these methods often require strong prior specifications and make parametric assumptions that may not always hold in biological systems. Additionally, these methods face challenges in domains where sample sizes are limited and statistical inference becomes constrained, with computational speed being a bottleneck in large models of biological systems. As an alternative, we propose the use of conformal inference methods, introducing two novel algorithms that, in some instances, offer non-asymptotic guarantees, enhancing robustness and scalability across various applications. We demonstrate the efficacy of our proposed algorithms through several scenarios, highlighting their advantages over traditional Bayesian approaches. The proposed methods show promising results for diverse biological data structures and scenarios, offering a general framework to quantify uncertainty for dynamic models of biological systems. The software for the methodology and the reproduction of the results is available at https://zenodo.org/doi/10.5281/zenodo.13644870.

Paper Structure (18 sections, 10 equations, 8 figures, 9 tables, 2 algorithms)


Figures (8)

  • Figure 1: Comparative analysis of the Logistic model predictive regions. This figure presents the $95\%$ predictive regions obtained from a 10-point dataset subjected to $10\%$ noise. The left subplot showcases results using four different methodologies: our two proposed methods (CUQDyn1 and CUQDyn2), the original jackknife+ method and a Bayesian approach implemented with STAN. The right subplot shows the predictive region and the predicted model for the CUQDyn1 algorithm applied to the same dataset. Numerical results related to this example are available in the Supplementary Information.
  • Figure 2: Boxplots of the marginal coverage $\mathbb{P}(Y_{n+1} \in \widehat{C}^{\alpha}(X_{n+1}))$ of our first algorithm, CUQDyn1, for different sample sizes and $\alpha = 0.05$, $0.1$, and $0.5$. The columns correspond to different noise levels ($0\%$, $1\%$, $5\%$, and $10\%$). The results remain very stable across all examined cases for larger sample sizes of 100 temporal points.
  • Figure 3: Comparative analysis of the Lotka-Volterra model predictive regions for the second state. This figure presents the $95\%$ predictive regions obtained from a 30-point dataset subjected to $10\%$ noise. The left subplot showcases results using three different methodologies: our two proposed methods (CUQDyn1 and CUQDyn2) and a Bayesian approach implemented with STAN. The right subplot shows the predictive region and the predicted model for the CUQDyn1 algorithm's counterpart, CUQDyn2, applied to the same dataset. Numerical results related to this example are available in the Supplementary Information.
  • Figure 4: Comparative analysis of the $\alpha$-pinene isomerization model predictive regions for the second state. This figure presents the $95\%$ predictive regions obtained from a 9-point real dataset. The left subplot showcases results using three different methodologies: our two proposed methods (CUQDyn1 and CUQDyn2) and a Bayesian approach implemented with STAN. The right subplot shows the predictive region and the predicted model for the CUQDyn1 algorithm applied to the same dataset. Numerical results related to this example are available in the Supplementary Information.
  • Figure 5: Comparative analysis of the Logistic model predictive regions. This figure presents the $95\%$ predictive regions obtained from a 10-point dataset subjected to $10\%$ noise. The left subplot showcases results using four different methodologies: our two proposed methods (CUQDyn1 and CUQDyn2), the original jackknife+ method and a Bayesian approach implemented with STAN. The right subplot shows the predictive region and the predicted model for the CUQDyn1 algorithm applied to the same dataset.
  • ...and 3 more figures
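
The marginal coverage $\mathbb{P}(Y_{n+1} \in \widehat{C}^{\alpha}(X_{n+1}))$ reported in Figure 2 can be illustrated with a small simulation. The sketch below is not CUQDyn1 or CUQDyn2: it uses plain split conformal prediction, a standard conformal variant, with a cubic polynomial as a stand-in predictor fitted to noisy samples of the closed-form logistic growth curve. The function names, the parameter values ($r$, $K$, $x_0$, noise level), and the polynomial predictor are all assumptions chosen for illustration only.

```python
import numpy as np


def logistic_solution(t, r=1.0, K=1.0, x0=0.05):
    """Closed-form solution of dx/dt = r*x*(1 - x/K) with x(0) = x0.

    The parameter values are illustrative assumptions, not the paper's.
    """
    return K / (1.0 + (K / x0 - 1.0) * np.exp(-r * t))


def split_conformal_halfwidth(residuals, alpha):
    """Half-width of a split conformal interval from calibration residuals.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest absolute residual;
    returns infinity when the calibration set is too small for that rank.
    """
    n = len(residuals)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf
    return np.sort(np.abs(residuals))[k - 1]


def simulate_coverage(alpha=0.1, n_train=50, n_cal=50, n_trials=2000,
                      noise=0.05, seed=0):
    """Monte Carlo estimate of the marginal coverage of split conformal."""
    rng = np.random.default_rng(seed)
    n = n_train + n_cal + 1  # training + calibration + one new point
    hits = 0
    for _ in range(n_trials):
        # Exchangeable sample: random time points with additive Gaussian noise.
        t = rng.uniform(0.0, 10.0, size=n)
        y = logistic_solution(t) + noise * rng.standard_normal(n)

        # Fit the stand-in predictor on the training part only.
        coef = np.polyfit(t[:n_train], y[:n_train], deg=3)

        # Calibrate the interval half-width on held-out residuals.
        cal = slice(n_train, n_train + n_cal)
        q = split_conformal_halfwidth(y[cal] - np.polyval(coef, t[cal]), alpha)

        # Check whether the new observation falls inside the interval.
        hits += int(abs(y[-1] - np.polyval(coef, t[-1])) <= q)
    return hits / n_trials


if __name__ == "__main__":
    print(f"empirical marginal coverage: {simulate_coverage():.3f}")
```

With exchangeable data, the estimate should land close to the nominal level $1 - \alpha$ even though the cubic polynomial is a misspecified model for the logistic curve. The infinite half-width returned for very small calibration sets shows why small datasets, such as the 10-point example in Figure 1, are challenging for off-the-shelf conformal procedures.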