Neural Ordinary Differential Equations for Simulating Metabolic Pathway Dynamics from Time-Series Multiomics Data
Udesh Habaraduwa, Andrei Lixandru
TL;DR
The paper addresses the challenge of predicting complex metabolic pathway dynamics from noisy time-series multiomics data. It applies Neural Ordinary Differential Equations (NODEs) to learn continuous, latent interactions between the proteome and metabolome in engineered E. coli, outperforming traditional machine learning baseline pipelines on the limonene and isopentenol pathways. Key contributions include RMSE improvements of more than 90% over baselines (up to 94.38% for limonene and 97.65% for isopentenol), substantial speedups in training and inference (about 1000x at inference), and the demonstration that physiology-informed regularization improves learning. This approach enables rapid, high-fidelity simulations for metabolic engineering and tailored biomedical applications, with the potential to accelerate design iterations and discovery.
Abstract
The advancement of human healthspan and bioengineering relies heavily on predicting the behavior of complex biological systems. While high-throughput multiomics data is becoming increasingly abundant, converting this data into actionable predictive models remains a bottleneck. High-capacity, data-driven simulation systems are critical in this landscape: unlike classical mechanistic models, which are restricted by prior knowledge, these architectures can infer latent interactions directly from observational data, allowing for the simulation of temporal trajectories and the anticipation of downstream intervention effects in personalized medicine and synthetic biology. To address this challenge, we introduce Neural Ordinary Differential Equations (NODEs) as a dynamic framework for learning the complex interplay between the proteome and metabolome. We applied this framework to time-series data derived from engineered Escherichia coli strains, modeling the continuous dynamics of metabolic pathways. The proposed NODE architecture captures system dynamics better than traditional machine learning pipelines. Our results show a greater than 90% improvement in root mean squared error (RMSE) over baselines on both the limonene (up to 94.38% improvement) and isopentenol (up to 97.65% improvement) pathway datasets. Furthermore, the NODE models demonstrated a 1000x acceleration in inference time, establishing them as a scalable, high-fidelity tool for the next generation of metabolic engineering and biological discovery.
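
To make the idea concrete, the following is a minimal sketch of how a NODE produces a trajectory: a small network defines the time derivative of the system state, and a numerical solver integrates it forward from an initial condition. Everything here is an illustrative assumption rather than the authors' implementation: the one-hidden-layer tanh network, the fixed-step RK4 solver, the untrained random weights, the function names, and the four-dimensional example state are all hypothetical. The abstract does not specify the architecture, solver, or training procedure. The sketch also assumes that "improvement in RMSE" means the relative reduction with respect to the baseline RMSE.

```python
import math
import random


def make_mlp(dim, hidden, seed=0):
    """Randomly initialised (untrained) network f_theta: R^dim -> R^dim."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden)] for _ in range(dim)]
    b2 = [0.0] * dim
    return w1, b1, w2, b2


def vector_field(state, params):
    """dz/dt = f_theta(z): tanh hidden layer followed by a linear readout."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * z for w, z in zip(row, state)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def rk4_step(state, dt, params):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, scale):
        return [z + scale * ki for z, ki in zip(state, k)]

    k1 = vector_field(state, params)
    k2 = vector_field(shifted(k1, dt / 2), params)
    k3 = vector_field(shifted(k2, dt / 2), params)
    k4 = vector_field(shifted(k3, dt), params)
    return [z + dt / 6 * (a + 2 * b + 2 * c + d)
            for z, a, b, c, d in zip(state, k1, k2, k3, k4)]


def simulate(z0, t_end, n_steps, params):
    """Integrate from t=0 to t_end; returns n_steps + 1 states including z0."""
    dt = t_end / n_steps
    trajectory = [list(z0)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(trajectory[-1], dt, params))
    return trajectory


def rmse(pred, target):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))


def rmse_improvement_pct(baseline_rmse, model_rmse):
    """Relative RMSE reduction versus a baseline, in percent (assumed definition)."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Hypothetical state: two protein levels followed by two metabolite levels.
params = make_mlp(dim=4, hidden=8)
z0 = [1.0, 0.5, 0.2, 0.0]
trajectory = simulate(z0, t_end=1.0, n_steps=20, params=params)
```

In a trained model, the weights would be fitted so that the simulated trajectory matches the measured time series, and the percentage figures quoted in the abstract would correspond to comparing the fitted model's RMSE against a baseline's RMSE under the assumed definition above.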
