Federated Nonlinear System Identification

Omkar Tupe, Max Hartman, Lav R. Varshney, Saurav Prakash

TL;DR

Federated learning consistently improves the convergence of each individual client as the number of participating clients increases, in line with the theoretical guarantee that the convergence rate improves with the number of clients.

Abstract

We consider federated learning of linearly-parameterized nonlinear systems. We establish theoretical guarantees on the effectiveness of federated nonlinear system identification compared to centralized approaches, demonstrating that the convergence rate improves as the number of clients increases. Although the convergence rates in the linear and nonlinear cases differ only by a constant, this constant depends on the feature map $\phi$, which can be carefully chosen in the nonlinear setting to increase excitation and improve performance. We experimentally validate our theory in physical settings where client devices are driven by i.i.d. control inputs and control policies exhibiting i.i.d. random perturbations, ensuring non-active exploration. Experiments use trajectories from nonlinear dynamical systems characterized by real-analytic feature functions, including polynomial and trigonometric components, representative of physical systems such as pendulum and quadrotor dynamics. We analyze the convergence behavior of the proposed method under varying noise levels and data distributions. Results show that federated learning consistently improves convergence of any individual client as the number of participating clients increases.
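
To make "linearly-parameterized nonlinear system" concrete, the following is a minimal single-client sketch, not the paper's code. It assumes dynamics of the form $x_{t+1} = \theta\,\phi(x_t, u_t) + w_t$ with a hypothetical pendulum-like feature map containing linear and trigonometric terms, an Euler discretization, and i.i.d. Gaussian inputs. The names `phi`, `simulate`, and every numeric constant are illustrative assumptions.

```python
import numpy as np

def phi(x, u):
    """Hypothetical feature map: linear terms plus one trigonometric term."""
    return np.array([x[0], x[1], np.sin(x[0]), u])

def simulate(theta, T, rng, noise_std=0.01, input_std=10.0):
    """Roll out x_{t+1} = theta @ phi(x_t, u_t) + w_t with i.i.d. inputs."""
    x = np.zeros(2)
    feats, nexts = [], []
    for _ in range(T):
        u = input_std * rng.standard_normal()  # open-loop i.i.d. excitation
        z = phi(x, u)
        x = theta @ z + noise_std * rng.standard_normal(2)
        feats.append(z)
        nexts.append(x)
    return np.array(feats), np.array(nexts)

# Assumed Euler-discretized damped pendulum; state = (angle, angular velocity).
dt, g_over_l, damping = 0.1, 9.81, 2.0
theta_true = np.array([
    [1.0, dt, 0.0, 0.0],
    [0.0, 1.0 - dt * damping, -dt * g_over_l, dt],
])

rng = np.random.default_rng(0)
Z, X_next = simulate(theta_true, T=2000, rng=rng)

# The model is linear in theta, so ordinary least squares applies directly.
theta_hat = np.linalg.lstsq(Z, X_next, rcond=None)[0].T
err = np.linalg.norm(theta_hat - theta_true)
print(f"estimation error: {err:.4f}")
```

Because the unknown parameters enter linearly, the nonlinearity lives entirely in `phi`. This is why the choice of feature map can change how well the regressors are excited.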

Paper Structure

This paper contains 22 sections, 5 theorems, 34 equations, 6 figures, 1 algorithm.

Key Result

Lemma 1

Let each client $i\in[M]$ run open‑loop inputs $u^{(i)}_t=\eta^{(i)}_t$. Let client $i$ collect a trajectory of length $T$. For every $t=0,1,\dots,T-1$, define the filtration Under Assumptions 1–4 there exist constants $s_\phi>0$ and $p_\phi\in(0,1)$ (as defined by musavi2024identification) such that, for every client $i$ and every unit vector $v\in\mathbb S^{n_\phi-1}$, Hence the regressor proc

Figures (6)

  • Figure 1: Federated learning framework for nonlinear dynamical system identification, involving a central server and $M$ clients that are similar but non-identical. In each global communication round $r$, client $C_i$ receives the global model $\theta_s$, performs local updates using its own trajectory data, and transmits the locally updated model $\theta_i$ back to the server. The server then aggregates the models to obtain an updated global model for the next round.
  • Figure 2: Estimation error versus the number of global iterations for the real-world nonlinear dynamical system of a pendulum using gradient descent (GD). Results illustrate the impact of varying the number of clients ($M$), the number of local samples per client ($N_i$), and the heterogeneity parameter ($\epsilon$), with each client performing $K_i = 1$ local updates at a learning rate of $10^{-2}$. Subfigures: (a) $N_i = 10$, $\epsilon = 0.01$; (b) $M = 10$, $\epsilon = 0.01$; (c) $M = 20$, $N_i = 10$.
  • Figure 3: Estimation error on synthetic data as a function of global iterations using gradient descent, evaluated across different client configurations of the number of clients ($M$), local dataset size per client ($N_i$), and heterogeneity parameter ($\epsilon$). In all cases, each client performs $K_i = 5$ local update steps with a fixed learning rate of $10^{-4}$. The following configurations are considered: (a) $N_i = 10$, $\epsilon = 0.1$ with varying $M$; (b) $M = 25$, $\epsilon = 0.1$ with varying $N_i$; (c) $M = 25$, $N_i = 25$ with varying $\epsilon$.
  • Figure 4: Comparison of estimation error versus $\sqrt{M}$ on (a) a real-world pendulum system and (b) synthetic data. The empirical results validate that, in low-heterogeneity settings, the non-asymptotic convergence rate can be improved by increasing the number of clients. (c) Impact of local updates ($K_i$) on estimation error.
  • Figure 5: Estimation error vs. global iterations for the nonlinear pendulum system using mini-batch SGD (batch size $10$). Results illustrate the impact of varying the number of clients ($M$), the number of local samples per client ($N_i$), and the heterogeneity parameter ($\epsilon$), with each client performing $K_i = 2$ local updates at a learning rate of $10^{-2}$. Subfigures: (a) $N_i = 10$, $\epsilon = 0.01$; (b) $M = 10$, $\epsilon = 0.01$; (c) $M = 20$, $N_i = 10$.
  • ...and 1 more figure
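
The communication round described in the Figure 1 caption (broadcast $\theta_s$, run $K_i$ local gradient steps, send back $\theta_i$, aggregate) can be sketched as below. This is an illustrative approximation under stated assumptions, not the paper's algorithm. It assumes a scalar-state system with a hypothetical trigonometric feature map, equal dataset sizes, unweighted averaging at the server, and client parameters drawn as small perturbations (scale `eps`) of a common `theta_0`. All function names and constants are made up for the example.

```python
import numpy as np

def phi(x, u):
    # Hypothetical real-analytic feature map with trigonometric components.
    return np.array([np.sin(x), np.cos(x), u])

def collect(theta, n, rng, noise_std=0.1):
    """One client's trajectory under i.i.d. open-loop inputs."""
    x, Z, Y = 0.0, [], []
    for _ in range(n):
        u = rng.standard_normal()
        z = phi(x, u)
        x = float(theta @ z) + noise_std * rng.standard_normal()
        Z.append(z)
        Y.append(x)
    return np.array(Z), np.array(Y)

def local_update(theta, Z, Y, steps, lr):
    """K local gradient-descent steps on the client's least-squares loss."""
    for _ in range(steps):
        grad = Z.T @ (Z @ theta - Y) / len(Y)
        theta = theta - lr * grad
    return theta

def fed_avg(datasets, rounds, local_steps, lr):
    """Server loop: broadcast, local updates, unweighted averaging (assumed)."""
    theta_s = np.zeros(datasets[0][0].shape[1])
    for _ in range(rounds):
        local = [local_update(theta_s.copy(), Z, Y, local_steps, lr)
                 for Z, Y in datasets]
        theta_s = np.mean(local, axis=0)
    return theta_s

rng = np.random.default_rng(0)
theta_0 = np.array([0.5, 0.3, 1.0])   # assumed common parameter
M, N, eps = 10, 50, 0.01              # clients, samples per client, heterogeneity
datasets = [collect(theta_0 + eps * rng.standard_normal(3), N, rng)
            for _ in range(M)]

theta_fed = fed_avg(datasets, rounds=200, local_steps=5, lr=0.1)
fed_error = np.linalg.norm(theta_fed - theta_0)
print(f"federated estimation error: {fed_error:.4f}")
```

Rerunning the script with different values of `M`, `N`, or `eps` gives a rough feel for the trends the figures report. The specific numbers it produces are not comparable to the paper's experiments.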

Theorems & Definitions (20)

  • Definition 1
  • Definition 2
  • Lemma 1: BMSB for open‑loop systems
  • proof
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Theorem 1: Finite‑sample error
  • proof
  • ...and 10 more