Mixtures of ensembles: System separation and identification via optimal transport

Filip Elvander, Isabel Haasler

TL;DR

The paper tackles the challenge of identifying and separating multiple heterogeneous subpopulations from aggregate observations while inferring each subpopulation's dynamics. It introduces an optimal transport-based framework that jointly decomposes the population into $K$ ensembles and identifies their dynamical systems via a bi-convex optimization solved by block coordinate descent, with convergence guarantees. Empirical results show the method attains close-to-oracle performance, maintaining high ensemble classification accuracy even under substantial noise. This approach enables robust inference in domains where only aggregate, not individual, trajectories are observable and can impact fields ranging from crowd dynamics to single-cell biology.

Abstract

Crowd dynamics and many large biological systems can be described as populations of agents or particles that can only be observed at the aggregate population level. Identifying the dynamics of the agents is crucial for understanding these large systems. However, the population of agents is typically not homogeneous, and thus the aggregate observations consist of the superposition of multiple ensembles, each governed by its own dynamics. In this work, we propose an optimal transport framework to jointly separate the population into several ensembles and identify each ensemble's dynamical system, based on aggregate observations of the population. We formulate a bi-convex optimization problem, which we solve using block coordinate descent with convergence guarantees. In numerical experiments, we demonstrate that the proposed approach exhibits close-to-oracle performance even in noisy settings, yielding accurate estimates of both the ensembles and the parameters governing their dynamics.
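
The alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes scalar linear dynamics $y = a_k x$ per ensemble, two unlabeled snapshots of the population, and a brute-force search over permutations standing in for an optimal transport solver. All function names (`best_matching`, `fit_slopes`, `block_coordinate_descent`) are hypothetical.

```python
import itertools


def best_matching(xs, ys, a):
    """Block 1: for fixed slopes `a`, find the matching and ensemble labels
    minimizing sum_i min_k (y_perm[i] - a_k * x_i)^2.
    Brute force over permutations; a stand-in for an OT solver (assumption)."""
    best = None
    for perm in itertools.permutations(range(len(ys))):
        cost, labels = 0.0, []
        for i, j in enumerate(perm):
            errs = [(ys[j] - ak * xs[i]) ** 2 for ak in a]
            k = min(range(len(a)), key=errs.__getitem__)
            cost += errs[k]
            labels.append(k)
        if best is None or cost < best[0]:
            best = (cost, list(perm), labels)
    return best


def fit_slopes(xs, ys, perm, labels, prev):
    """Block 2: for a fixed matching and labels, update each slope a_k
    by least squares on the pairs assigned to ensemble k."""
    a = list(prev)
    for k in range(len(prev)):
        idx = [i for i in range(len(xs)) if labels[i] == k]
        den = sum(xs[i] ** 2 for i in idx)
        if den > 0:  # keep the previous slope if ensemble k is empty
            a[k] = sum(xs[i] * ys[perm[i]] for i in idx) / den
    return a


def block_coordinate_descent(xs, ys, a_init, n_iter=5):
    """Alternate the two blocks; the recorded cost never increases."""
    a, history = list(a_init), []
    for _ in range(n_iter):
        cost, perm, labels = best_matching(xs, ys, a)
        history.append(cost)
        a = fit_slopes(xs, ys, perm, labels, a)
    return a, labels, history


# Toy data: particles 1.0 and 2.0 follow slope 0.5, particles 3.0 and 4.0
# follow slope 2.0; the second snapshot is observed in shuffled order.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [6.0, 0.5, 8.0, 1.0]
a_hat, labels, history = block_coordinate_descent(xs, ys, a_init=[0.4, 1.8])
```

On this noiseless example, starting from slopes close to the truth, the sketch recovers the slopes 0.5 and 2.0 and the grouping of the four particles. The permutation search is only feasible for a handful of points, and like any block coordinate descent on a non-convex problem, the result depends on the initialization.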

Paper Structure

This paper contains 9 sections, 2 theorems, 14 equations, 5 figures, 1 algorithm.

Key Result

Proposition 1

If the dynamics $\Phi_\theta$ in eq:dyanmics are linear in the parameter $\theta$, then the non-convex problem eq:problem_form is bi-convex in the sets
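
To illustrate why linearity in the parameter matters, consider a simplified cost in notation assumed here (it is not the paper's formulation): a transport plan $m \geq 0$ between points $x_i$ and $y_j$, and linear dynamics with parameter $\theta = A$, so that a particle at $x_i$ is predicted to move to $A x_i$.

$$
f(m, A) = \sum_{i,j} m_{ij} \, \lVert y_j - A x_i \rVert_2^2
$$

For fixed $A$, $f$ is linear in $m$ and therefore convex. For fixed $m \geq 0$, $f$ is a nonnegative weighted sum of convex quadratics in $A$ and therefore convex. It is not jointly convex in $(m, A)$ in general, which is the bi-convex structure that block coordinate descent exploits by alternating the two updates, with $\mathcal{M}$ denoting the (assumed) convex set of feasible plans:

$$
m^{(n+1)} \in \arg\min_{m \in \mathcal{M}} f\big(m, A^{(n)}\big), \qquad
A^{(n+1)} \in \arg\min_{A} f\big(m^{(n+1)}, A\big)
$$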

Figures (5)

  • Figure 1: Time evolution of systems belonging to $K = 3$ different ensembles with different linear dynamics for the state $x^{(t)} = (x_1,x_2) \in {\mathbb{R}}^2$. Three consecutive time steps $t$ are shown.
  • Figure 2: Classical optimal transport between two Gaussian mixtures. The left plot shows the given distributions and the optimal transport plan $m$, where dark areas correspond to the support of $m$ in ${\mathbb{R}} \times {\mathbb{R}}$. The right plot shows the evolution of the distribution over time from $\mu$ to $\nu$.
  • Figure 3: Separated optimal transport between two Gaussian mixtures. The plots on the left show the given distributions and the optimal transport plans $m_1$ and $m_2$, where dark areas correspond to the support in ${\mathbb{R}}^2$. The plots on the right show the corresponding evolutions of the distributions over time.
  • Figure 4: Squared error for estimates of the ensemble dynamics parameters for discrete distributions observed over $T = 7$ time points, for varying noise level. The number of ensembles is $K = 3$. The lines correspond to the median over 500 simulations, and the confidence regions cover 90% of the observed errors.
  • Figure 5: The fraction of particles correctly grouped into their corresponding ensembles for the same scenario as Figure 4. The lines correspond to the median over 500 simulations, and the confidence regions cover 90% of the observed classification fractions.

Theorems & Definitions (4)

  • Example 1: Linear dynamical systems
  • Proposition 1
  • Proposition 2
  • proof