Data-driven Learning of Interaction Laws in Multispecies Particle Systems with Gaussian Processes: Convergence Theory and Applications

Jinchao Feng, Charles Kulick, Sui Tang

TL;DR

The paper develops a data-driven Gaussian process framework to learn interaction kernels in multispecies particle systems, extending prior single-species theory to two-type agent dynamics with intra- and inter-species kernels $\{\phi^{pq}\}$. It establishes a rigorous statistical theory, including identifiability via a coercivity condition, a kernel ridge regression interpretation, and non-asymptotic error bounds for reconstruction and uncertainty quantification. The approach is validated through numerical experiments on two-species aggregation and predator–prey models, showing accurate kernel recovery and reliable trajectory predictions, with data efficiency and transferability to larger systems. This work advances multiscale modeling by providing a principled, uncertainty-aware method to infer microscopic interaction laws from macroscopic trajectory data, and it connects Bayesian GP learning with inverse-problem theory.

Abstract

We develop a Gaussian process framework for learning interaction kernels in multi-species interacting particle systems from trajectory data. Such systems provide a canonical setting for multiscale modeling, where simple microscopic interaction rules generate complex macroscopic behaviors. While our earlier work established a Gaussian process approach and convergence theory for single-species systems, and later extended to second-order models with alignment and energy-type interactions, the multi-species setting introduces new challenges: heterogeneous populations interact both within and across species, the number of unknown kernels grows, and asymmetric interactions such as predator-prey dynamics must be accommodated. We formulate the learning problem in a nonparametric Bayesian setting and establish rigorous statistical guarantees. Our analysis shows recoverability of the interaction kernels, provides quantitative error bounds, and proves statistical optimality of posterior estimators, thereby unifying and generalizing previous single-species theory. Numerical experiments confirm the theoretical predictions and demonstrate the effectiveness of the proposed approach, highlighting its advantages over existing kernel-based methods. This work contributes a complete statistical framework for data-driven inference of interaction laws in multi-species systems, advancing the broader multiscale modeling program of connecting microscopic particle dynamics with emergent macroscopic behavior.
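
To make the setting concrete, the following is a minimal sketch of a first-order two-species system driven by four kernels $\phi^{pq}$. The excerpt does not state the governing equations, so the form used here, $\dot{x}_i = \sum_{q} \frac{1}{N_q} \sum_{j \in \text{species } q} \phi^{pq}(|x_j - x_i|)(x_j - x_i)$ for an agent $i$ of species $p$, is an assumption, as are the function name `velocities`, the $1/N_q$ normalization, and the toy constant kernels.

```python
import numpy as np

def velocities(x, species, phi):
    """Right-hand side of an assumed first-order two-species model.

    x       : (N, d) array of agent positions
    species : (N,) integer array with entries in {0, 1}
    phi     : phi[p][q] is a callable acting elementwise on distances
              (hypothetical stand-in for the kernel phi^{pq})
    """
    counts = np.bincount(species, minlength=2)
    diff = x[None, :, :] - x[:, None, :]      # diff[i, j] = x_j - x_i
    dist = np.linalg.norm(diff, axis=-1)      # pairwise distances
    v = np.zeros_like(x, dtype=float)
    for p in range(2):
        rows = np.where(species == p)[0]
        for q in range(2):
            cols = np.where(species == q)[0]
            if rows.size == 0 or cols.size == 0:
                continue
            r = dist[np.ix_(rows, cols)]
            # Avoid evaluating the kernel at r = 0 (self-pairs contribute nothing).
            r_safe = np.where(r > 0, r, 1.0)
            w = np.where(r > 0, phi[p][q](r_safe), 0.0) / counts[q]
            v[rows] += np.einsum("ij,ijd->id", w, diff[np.ix_(rows, cols)])
    return v

# Toy asymmetric example (illustrative kernels only, not from the paper).
one = lambda r: np.ones_like(r)
phi = [[one, one], [lambda r: -2.0 * np.ones_like(r), one]]
x = np.array([[0.0, 0.0], [1.0, 0.0]])
species = np.array([0, 1])
v = velocities(x, species, phi)
```

Choosing $\phi^{01} \neq \phi^{10}$, as in the toy example, is what allows asymmetric interactions of the predator-prey type mentioned in the abstract.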

Paper Structure

This paper contains 38 sections, 18 theorems, 118 equations, 7 figures, 9 tables, 1 algorithm.

Key Result

Lemma 4.2

Under Assumption 1, for any $\bm{\varphi} = (\varphi_{pq}) \in \prod_{p,q} \mathcal{H}_{K^{pq}}$ and every pair $(p,q)$, we have $\|\varphi_{pq}\|_{\infty}\leq \kappa_{pq} \|\varphi_{pq}\|_{\mathcal{H}_{K^{pq}}}$.
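
Assumption 1 is not reproduced in this summary. If it includes the usual boundedness condition $\sup_{r} K^{pq}(r,r) \leq \kappa_{pq}^2$ (an assumption made here for illustration), the bound follows from the reproducing property and the Cauchy-Schwarz inequality. For every $r$ in the domain,

$$
\begin{aligned}
|\varphi_{pq}(r)| &= \left| \langle \varphi_{pq}, K^{pq}(r,\cdot) \rangle_{\mathcal{H}_{K^{pq}}} \right| \\
&\leq \|\varphi_{pq}\|_{\mathcal{H}_{K^{pq}}} \, \|K^{pq}(r,\cdot)\|_{\mathcal{H}_{K^{pq}}} \\
&= \|\varphi_{pq}\|_{\mathcal{H}_{K^{pq}}} \sqrt{K^{pq}(r,r)} \\
&\leq \kappa_{pq} \, \|\varphi_{pq}\|_{\mathcal{H}_{K^{pq}}},
\end{aligned}
$$

and taking the supremum over $r$ gives the stated inequality.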

Figures (7)

  • Figure 1: Results of kernel learning for the repulsive potential dynamics with $N_1 = N_2 = 10$, $L=10$, and $M=10$ with noise $\sigma = 0.01$. Left, Center: The four interaction kernels are shown with true function in black and predicted mean in blue, with the shaded region indicating the standard deviation band. Gray bars show the empirical distribution of pairwise distances. Right: Training and testing data trajectory prediction plots on $[0,2T]$ are presented, with the true dynamics on the left of each pair and the predicted dynamics on the right. A black dot marks each trajectory at the time snapshot $t=T$. The top pair utilizes a training trajectory to test temporal generalization, while the bottom pair uses test data. The system evolution and steady-state behavior are extremely similar when using the predicted interaction functions.
  • Figure 2: Analysis of the noise dependence of kernel learning and trajectory prediction errors as a function of the noise level $\sigma$ for the repulsive potential dynamics on a log-log plot. Each curve shows the mean error across ten random seeds, with error bars indicating standard deviation. (Left) Relative $L^2(\tilde{\rho}_T^{pq,L})$ errors for the four interaction kernels. (Center) Relative $L^\infty([0,R])$ errors for the four interaction kernels. Note the consistent linear behavior; the slope $\alpha$ in the legend indicates the power-law rate of error growth (error $\sim \sigma^\alpha$) as the noise increases. Once the noise is very small, bias (discretization and finite basis) dominates, which explains the plateau. (Right) Relative trajectory prediction errors for training data (blue) and test data (red) on both the training period $[0,T]$ and temporal generalization period $[T,2T]$. The $L^2(\tilde{\rho}_T^{pq,L})$ error and trajectory error steadily decrease until around $\sigma = 10^{-3}$, with smaller noise levels yielding diminishing returns past this point as they approach the zero-noise accuracy level.
  • Figure 3: Convergence analysis of kernel learning and trajectory prediction errors as a function of the number of training trajectories $M$ for the linear-repulsive potential dynamics. Each curve shows the mean error across ten random seeds, with error bars indicating standard deviation. The slope $\alpha$ in the legend indicates the power-law convergence rate (error $\sim M^\alpha$). (Left) Relative $L^2(\tilde{\rho}_T^{pq,L})$ errors for the four interaction kernels. (Center) Relative $L^\infty([0,R])$ errors for the four interaction kernels. (Right) Relative trajectory prediction errors for training data (blue) and test data (red) on both the training period $[0,T]$ and temporal generalization period $[T,2T]$.
  • Figure 4: Results of kernel learning for the linear-repulsive potential dynamics with $N_1 = N_2 = 10$, $L=5$, and $M=5$ with noise $\sigma = 0.01$. Left, Center: The four interaction kernels are shown with true function in black and predicted mean in blue, with the shaded region indicating the standard deviation band. Gray bars show the empirical distribution of pairwise distances. Right: Training and testing data trajectory prediction plots on $[0,2T]$ are presented, with the true dynamics on the left of each pair and the predicted dynamics on the right. A black dot marks each trajectory at the time snapshot $t=T$. The top pair utilizes a training trajectory to test temporal generalization while the bottom pair uses test data. The predicted interaction functions are sufficiently accurate to closely reconstruct the true dynamics.
  • Figure 5: Kernel learning result for linear-repulsive potentials, with learned kernels from $N_1 = N_2 = 10$ used for dynamics prediction on systems with larger numbers of particles, $N_1 = N_2 = 50$ and $N_1 = N_2 = 100$. Learned kernels transfer well and predict dynamics with high fidelity.
  • ...and 2 more figures
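
The slopes $\alpha$ reported in the legends of Figures 2 and 3 describe power laws of the form error $\sim \sigma^\alpha$ or error $\sim M^\alpha$. One common way to estimate such a slope is a least-squares line fit in log-log coordinates. The sketch below illustrates that idea; the paper's own fitting procedure is not described in this summary, the helper name `fit_power_law` is hypothetical, and the data are synthetic rather than taken from the paper.

```python
import numpy as np

def fit_power_law(x, err):
    """Fit err ~ C * x**alpha by linear least squares on (log x, log err).

    Returns the pair (alpha, C).
    """
    x = np.asarray(x, dtype=float)
    err = np.asarray(err, dtype=float)
    alpha, log_c = np.polyfit(np.log(x), np.log(err), 1)
    return alpha, np.exp(log_c)

# Synthetic errors that decay exactly like M**(-1/2); not data from the paper.
M = np.array([1, 2, 4, 8, 16, 32], dtype=float)
err = 0.5 * M ** -0.5
alpha, C = fit_power_law(M, err)
```

With noisy errors, points on the plateau described in Figure 2 would typically be excluded before fitting, since they no longer follow the power law.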

Theorems & Definitions (34)

  • Lemma 4.2
  • proof
  • Definition 4.3
  • Theorem 4.4
  • Theorem 4.5: $\mathcal{H}_{{K}}$-bound
  • Theorem 4.6: Convergence rate of reconstruction error in $\prod_{p,q} \mathcal{H}_{{K}^{pq}}$ norm
  • Lemma A.2
  • proof
  • Proposition A.3
  • proof
  • ...and 24 more