Towards Learning Stochastic Population Models by Gradient Descent

Justin N. Kreikemeyer, Philipp Andelfinger, Adelinde M. Uhrmacher

TL;DR

The authors address the problem of learning mechanistic stochastic population models from data by jointly inferring parameters and model structure using gradient descent on simulation-based objectives. They introduce multiple formulations for learning reaction networks, a reparametrization to stabilize optimization, and a smoothed stochastic gradient estimator to handle the discrete jumps inherent in trajectories of the stochastic simulation algorithm (SSA). An evaluation on SIR-style systems reveals a tradeoff between parsimony, fit, and scalability; parsimony-enhancing constraints and reparametrization partially mitigate the optimization difficulties. The work demonstrates the feasibility of gradient-based discovery for stochastic reaction networks and outlines key challenges and directions for improving identifiability, efficiency, and structure search.

Abstract

Increasing effort is put into the development of methods for learning mechanistic models from data. This task entails not only the accurate estimation of parameters but also the identification of a suitable model structure. Recent work on the discovery of dynamical systems formulates this problem as a linear equation system. Here, we explore several simulation-based optimization approaches, which allow much greater freedom in the objective formulation and weaker conditions on the available data. We show that even for relatively small stochastic population models, simultaneous estimation of parameters and structure poses major challenges for optimization procedures. In particular, we investigate the application of the local stochastic gradient descent method, commonly used for training machine learning models. We demonstrate accurate estimation of models but find that enforcing the inference of parsimonious, interpretable models drastically increases the difficulty. We give an outlook on how this challenge can be overcome.
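
To make the idea of gradient descent on a simulation-based objective more concrete, the following is a minimal sketch, not the authors' implementation. It simulates an SIR model with the Gillespie direct method and estimates a gradient by Gaussian smoothing with antithetic perturbations, a generic technique for objectives with jumps. The function names (`simulate_sir`, `loss`, `smoothed_grad`), the initial counts, the loss on the final recovered count, and the log-space rate parametrization (used here only to keep rates positive) are all assumptions made for illustration and may differ from the formulations in the paper.

```python
import numpy as np

def simulate_sir(beta, gamma, s0=95, i0=5, r0=0, t_end=10.0, rng=None):
    """Gillespie direct-method SSA for S + I -> 2I and I -> R.
    Returns the (S, I, R) counts reached by time t_end."""
    rng = np.random.default_rng() if rng is None else rng
    s, i, r = s0, i0, r0
    n = s0 + i0 + r0
    t = 0.0
    while i > 0:
        a_inf = beta * s * i / n       # infection propensity
        a_rec = gamma * i              # recovery propensity
        a_total = a_inf + a_rec
        t += rng.exponential(1.0 / a_total)  # time to the next reaction
        if t > t_end:
            break
        if rng.random() * a_total < a_inf:
            s, i = s - 1, i + 1        # infection fires
        else:
            i, r = i - 1, r + 1        # recovery fires
    return s, i, r

def loss(theta, target_r, n_runs=10, seed=0):
    """Squared error between the mean final R count and a target value.
    theta holds log-rates (an assumption of this sketch); the fixed seed
    reuses the same random numbers for every evaluation."""
    rng = np.random.default_rng(seed)
    beta, gamma = np.exp(theta)
    finals = [simulate_sir(beta, gamma, rng=rng)[2] for _ in range(n_runs)]
    return (np.mean(finals) - target_r) ** 2

def smoothed_grad(f, theta, sigma=0.2, n_samples=16, rng=None):
    """Monte Carlo gradient of the Gaussian-smoothed objective
    E[f(theta + sigma * eps)], using antithetic pairs of perturbations."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        eps = rng.standard_normal(theta.shape)
        diff = f(theta + sigma * eps) - f(theta - sigma * eps)
        grad += diff * eps / (2.0 * sigma)
    return grad / n_samples

if __name__ == "__main__":
    theta = np.log([0.3, 0.2])         # hypothetical starting rates
    g = smoothed_grad(lambda th: loss(th, target_r=60.0), theta,
                      rng=np.random.default_rng(1))
    print("estimated gradient w.r.t. log-rates:", g)
```

The estimator never differentiates through the simulator, so the discrete jumps of individual trajectories do not matter. The price is a bias controlled by `sigma` and a variance that grows with the number of parameters, which is one plausible reason why larger structure-learning problems become hard.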

Paper Structure

This paper contains 8 sections, 3 equations, and 2 figures.

Figures (2)

  • Figure 1: The SIR model's response surface (left) and the effect of reparametrization (right). Darker colors indicate lower loss, and the star marks the optimum.
  • Figure 2: Convergence of gradient descent on the four problems (top) and selected inferred models (bottom). Progress is shown on the unsmoothed objective; the optimal solution has a loss of about $0.01$ (depending on the inferred system's stochasticity). The reaction system depicted for Library of Reactions shows only the top 3 of the 17 learned reactions above the threshold $10^{-4}$.