Non-linear correlations underlie linear response and causality

Gabriele Di Antonio, Gianni Valerio Vinci

Abstract

The inference of causal relationships among observed variables is a pivotal, longstanding problem in the scientific community. An intuitive method for quantifying these causal links involves examining the response of one variable to perturbations in another. The fluctuation-dissipation theorem elegantly connects this response to the correlation functions of the unperturbed system, thereby bridging the concepts of causality and correlation. However, this relationship becomes intricate in nonlinear systems, where knowledge of the invariant measure is required but elusive, especially in high-dimensional spaces. In this study, we establish a novel link between the Koopman operator of nonlinear stochastic systems and the response function. This connection provides an alternative method for computing the response function using generalized correlation functions, even when the invariant measure is unknown. We validate our theoretical framework by applying it to a nonlinear high-dimensional system amenable to exact solutions, demonstrating convergence and consistency with established results. Finally, we discuss a significant interplay between the resulting causal network and the relevant time scales of the system.
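To make the link between response and correlation concrete, the following sketch treats only the linear, Gaussian special case, where the fluctuation-dissipation theorem reduces to $R(t) = C(t)\,C(0)^{-1}$ with $C_{ij}(t) = \langle z_i(t)\, z_j(0)\rangle$. It is not the Koopman-based estimator introduced in the paper. The two-variable system, its parameters (`a`, `b`, `c`, `sigma`) and the Euler-Maruyama integrator are all assumptions chosen for illustration: variable $x$ drives $y$, but $y$ does not act back on $x$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear system (illustrative parameters, not from the paper):
#   dx = -a*x dt + sigma dW1
#   dy = (-b*y + c*x) dt + sigma dW2
a, b, c, sigma = 1.0, 2.0, 1.5, 1.0
A = np.array([[a, 0.0],
              [-c, b]])            # drift written as dz/dt = -A z

dt, n_traj = 0.01, 50_000          # time step, number of independent trajectories
burn_in, lag = 1000, 50            # steps to reach stationarity, lag in steps


def step(z):
    """One Euler-Maruyama step for every trajectory (rows of z)."""
    noise = sigma * np.sqrt(dt) * rng.standard_normal(z.shape)
    return z - (z @ A.T) * dt + noise


# Relax the ensemble to the stationary state.
z = np.zeros((n_traj, 2))
for _ in range(burn_in):
    z = step(z)
z0 = z.copy()

# Evolve the unperturbed system for the chosen lag.
for _ in range(lag):
    z = step(z)

# Equal-time and lagged correlation matrices (the process has zero mean).
C0 = z0.T @ z0 / n_traj
Ct = z.T @ z0 / n_traj             # C_ij(t) = <z_i(t) z_j(0)>

# Linear-Gaussian fluctuation-dissipation estimate of the response matrix.
R_est = Ct @ np.linalg.inv(C0)

# Exact response of the linear system, exp(-A t), written out for this A.
t = lag * dt
R_exact = np.array([
    [np.exp(-a * t), 0.0],
    [c * (np.exp(-a * t) - np.exp(-b * t)) / (b - a), np.exp(-b * t)],
])

print("estimated response:\n", R_est)
print("exact response:\n", R_exact)
```

In this sketch the stationary cross-correlation between $x$ and $y$ is symmetric and non-zero, while the estimated response matrix is asymmetric: the entry for $y$ responding to $x$ is clearly positive, and the entry for $x$ responding to $y$ is close to zero. This asymmetry is what makes the response a candidate measure of causal influence. For nonlinear systems this simple matrix formula is no longer exact, which is the situation the abstract addresses.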

Paper Structure

This paper contains 2 sections, 52 equations, 7 figures.

Figures (7)

  • Figure 1: (a) Estimates of the response function to a perturbation of variable $y_3$ for the six-dimensional example system described in \ref{eq:toyexample}. The solid line represents the ground truth; the points show estimates obtained from the perturbation definition and from the fluctuation-correlation formula in \ref{eq:MainResult}, the latter for the first-order $\mathrm{H}_1$ and second-order $\mathrm{H}_2$ Hermite polynomial observables. Averages are taken over $10^6$ experiment realizations with system parameters $\sigma_x=0.1$, $\sigma_y=0.025$. (b-c) Average reconstruction error of the response curves across all dimensions and time steps, measured as root mean square error, for the same system as in (a). Curves are estimated from the perturbation limit and from \ref{eq:MainResult} using the observables $\psi_k(\mathbf x) = x_k$, named $\mathrm H_x$, and the Hermite bases $\mathrm H_1$, $\mathrm H_2$. The analysis considers (b) varying numbers of process realizations ($\sigma_y=0.0$) and (c) different noise levels $\sigma_y$ ($10^6$ experiments). The shaded ribbon shows the standard deviation over $10$ complete repetitions of the analysis. The time resolution is $\mathrm{dt}=0.01\,\mathrm{s}$, the perturbation amplitude is $\epsilon=0.01$, and integration uses the stochastic Heun method.
  • Figure 2: In the top row, we show the stationary distribution for system (\ref{eq:GradSystem}) in panel (a), (\ref{eq:TanhSystem}) in panel (b), and (\ref{eq:Stuart-Landau}) in panel (c). In the bottom row, the response functions (in black) are computed from the time-dependent solution of the Fokker-Planck equation, while the scatter points represent the result of equation (\ref{eq:MainResult}) for different basis functions $H_1$, $H_2$, and $H_3$, as shown in Fig. \ref{fig:1}. The parameters are $\alpha=0.1$, $a=1$, $b=-0.3$, $J_{11}=0.9$, $J_{12}=0.5$, $J_{21}=0.3$, $J_{22}=0.6$, and $\sigma=0.3$.
  • Figure 3: (a) Weight coefficients $\beta_n$ for the Stuart-Landau system \ref{eq:Stuart-Landau}. Dominant terms (highlighted in red) were selected as basis functions for the response function estimation. (b) Comparison between the estimated response function (derived from the reduced basis) and the exact analytical result.
  • Figure 4: (a) Normalized residual error for variables $x$ and $y$ in the gradient system \ref{eq:GradSystem}, plotted against dictionary size. (b) Estimated entropy from eq. \ref{eq:ConstrainEntropy} as a function of dictionary size. The dashed black line denotes the exact theoretical entropy computed from the stationary distribution.
  • Figure 5: (a) Sorted absolute values of the coefficients $|\beta_n|$ for the stochastic Lorenz system. (b) Relative error between the exact response and the $n$-element dictionary estimates. (c) Residual error of the response approximations. (d) Difference between estimated entropy values as in eq. (\ref{eq:dS0}). (e) Comparison of the exact response (black curve) and the approximated responses (colored scatter points) for three dictionary sizes.
  • ...and 2 more figures