Machine learning of network inference enhancement from noisy measurements

Kai Wu; Yuanyuan Li; Jing Liu

TL;DR

This work presents an elegant and efficient model-agnostic framework tailored to amplify the capabilities of model-based and model-free network inference models for real-world cases, and showcases substantial performance augmentation under varied noise types.

Abstract

Inferring networks from observed time series data offers a clear view of the interconnections among nodes. When dealing with real-world open cases, especially in the presence of observational noise, network inference models suffer a sharp decline in performance, which significantly undermines their practical applicability. We find that in real-world scenarios, noisy samples cause parameter updates in network inference models to deviate from the correct direction, leading to a degradation in performance. Here, we present an elegant and efficient model-agnostic framework tailored to amplify the capabilities of model-based and model-free network inference models for real-world cases. Extensive experiments across nonlinear dynamics, evolutionary games, and epidemic spreading showcase substantial performance augmentation under varied noise types, particularly in scenarios enriched with clean samples.

Paper Structure (2 equations, 4 figures)

Figures (4)

Figure 1: Graphical illustration of the MANIE framework based on noisy time series. $\bm{A}^\ast$ or $\bm{v}^\ast$ denotes the value of $\bm{A}$ or $\bm{v}$ at which the loss function attains its minimum. “Global noise” means that all samples are contaminated by noise, while “local noise” means that only a portion of the samples is contaminated. “Clean” refers to uncontaminated sample data.
Figure 2: AUC scores of different methods in various scenarios. The negative logarithm of the AUC is taken to convert it into an unrestricted indicator with a more intuitive interpretation; a smaller $-\log_{2}(\text{AUC})$ indicates better inference performance. (a) Enhancing model-free network inference methods. "Example" is a 5$\times$5 example network whose specific form is described in the supplementary material. Insilico_10 and Insilico_100 refer to networks of sizes 10 and 100 in DREAM4 challenge 5, where the suffix distinguishes different networks. ZK_Kuramoto1 denotes the dynamics of Zachary's karate club network (ZK) [45] under a Kuramoto-1 oscillator. Similarly, Random_Kuramoto1 and Random_Kuramoto2 denote the dynamics of directed random networks under Kuramoto-1 and Kuramoto-2 oscillators [46], respectively. The suffix "_0" denotes noise-free time series. The suffix "_1" indicates that white Gaussian noise with a signal-to-noise ratio (SNR) of 10 is added to the global time series. For "_2", white Gaussian noise is added at each timestep with a probability of 0.5 and an SNR of 10 dB. For "_3", white Gaussian noise is added at each timestep with a probability of 0.5 and a randomly generated SNR between 0 and 10. Enhancing model-based inference from evolutionary game (EG) data using MANIE with embedded (b) STRidge or (c) LASSO. Here, we employ the ZK, football [57], and dolphin [58] networks. Additionally, we use four synthetic networks with 40 nodes each: an Erdős-Rényi random network (ER) [53], a Barabási-Albert scale-free network (BA) [54], a Newman-Watts small-world network (NW) [55], and a Watts-Strogatz small-world network (WS) [56]. For these cases, (i) "_1": random noise with an amplitude of 10 is added to each time step's recordings with a probability of 0.5; (ii) "_2": random noise with random amplitudes less than 10 is added to each time step's recordings with a probability of 0.5; (iii) "_3": random noise is added to the global data with amplitudes of 1, 5, and 10, and the final results are averaged. Enhancing model-based inference of propagation networks for (d) SIS dynamics or (e) CP dynamics using MANIE with embedded CS. We introduce two common types of noise: (i) "_1", where 10% of nodes in the binary time series are missing; (ii) "_2", where each record is recorded incorrectly with a probability of 0.1, flipping its value between 0 and 1.
Figure 3: Influence of noise levels on MANIE in the context of ZK network inference. AUC scores for model-based inference from EG data using MANIE with embedded STRidge, (a) as a function of the amplitude of random noise with the fraction of noisy data fixed at 0.5, and (b) as a function of the fraction of noisy data with the amplitude of random noise fixed at 10. AUC scores for model-free network inference, (c) as a function of the signal-to-noise ratio (SNR, in dB) and (d) as a function of the fraction of noisy data. The EG process was repeated 6 times, each repetition comprising 10 rounds. Model-free data were collected as short transient time series of 5 time steps, starting from 50 distinct initial conditions, using a Kuramoto-1 oscillator.
Figure 4: Illustration of the optimization of the weight vector $\bm{v}$. $\bm{Y}$ denotes the noisy time series. Blue marks denote clean data, while red marks denote data affected by noise. (a) Model-free inference. (b) Model-based inference.
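
The noise settings and the score transform described in the Figure 2 caption can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: all function names (`add_awgn`, `add_local_awgn`, `flip_binary`, `auc_score`, `neg_log2_auc`) are hypothetical, the SNR is assumed to be given in dB and measured against the mean power of the samples being contaminated, each row of the array is assumed to be one timestep, and the AUC is computed by plain pairwise comparison of scores rather than by whatever routine the paper used.

```python
import numpy as np


def add_awgn(x, snr_db, rng):
    """Add white Gaussian noise to x at the requested SNR (assumed to be in dB)."""
    x = np.asarray(x, dtype=float)
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return x + rng.normal(0.0, np.sqrt(noise_power), size=x.shape)


def add_local_awgn(series, snr_db, prob, rng):
    """Contaminate each timestep (row) independently with probability `prob`.

    Returns the noisy series and a boolean mask marking the noisy rows.
    """
    noisy = np.array(series, dtype=float)  # float copy of the input
    mask = rng.random(noisy.shape[0]) < prob
    if mask.any():
        noisy[mask] = add_awgn(noisy[mask], snr_db, rng)
    return noisy, mask


def flip_binary(records, prob, rng):
    """Flip each 0/1 record with probability `prob` (binary-noise setting)."""
    records = np.asarray(records)
    flips = rng.random(records.shape) < prob
    return np.where(flips, 1 - records, records)


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    correct = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return correct / (pos.size * neg.size)


def neg_log2_auc(labels, scores):
    """Transform AUC into -log2(AUC); smaller values mean better inference."""
    return -np.log2(auc_score(labels, scores))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.sin(np.linspace(0.0, 10.0, 200)).reshape(50, 4)
    # "Global" noise: every sample is contaminated.
    global_noisy = add_awgn(clean, snr_db=10.0, rng=rng)
    # "Local" noise: each timestep is contaminated with probability 0.5.
    local_noisy, mask = add_local_awgn(clean, snr_db=10.0, prob=0.5, rng=rng)
    print("noisy timesteps:", int(mask.sum()), "of", mask.size)
```

Under this transform a perfect ranking (AUC = 1) maps to 0 and a chance-level ranking (AUC = 0.5) maps to 1, which is why smaller values in Figure 2 correspond to better inference.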