Inference in conditioned dynamics through causality restoration

Alfredo Braunstein, Giovanni Catania, Luca Dall'Asta, Matteo Mariani, Anna Paola Muntoni

TL;DR

The Causal Variational Approach is proposed as an approximate method for generating independent samples from a conditioned distribution. It yields an effective unconditioned distribution that is easy to interpret, and it can be applied to virtually any dynamics.

Abstract

Computing observables from conditioned dynamics is typically computationally hard: although obtaining independent samples efficiently from the unconditioned dynamics is usually feasible, most of the samples must generally be discarded (in a form of importance sampling) because they do not satisfy the imposed conditions. Sampling directly from the conditioned distribution is non-trivial, as conditioning breaks the causal properties of the dynamics, which are ultimately what render the sampling procedure efficient. One standard way to do so is through a Metropolis Monte-Carlo procedure, but this procedure is normally slow, and a very large number of Monte-Carlo steps is needed to obtain a small number of statistically independent samples. In this work, we propose an alternative method to produce independent samples from a conditioned distribution. The method learns the parameters of a generalized dynamical model that optimally describe the conditioned distribution in a variational sense. The outcome is an effective, unconditioned dynamical model from which one can trivially obtain independent samples, effectively restoring the causality of the conditioned distribution. The consequences are twofold. On the one hand, the method allows us to efficiently compute observables from the conditioned dynamics by simply averaging over independent samples. On the other hand, it gives an effective unconditioned distribution that is easier to interpret. The method is flexible and can be applied to virtually any dynamics. We discuss an important application of the method, namely the problem of epidemic risk assessment from (imperfect) clinical tests, for a large family of time-continuous epidemic models endowed with a Gillespie-like sampler. We show that the method compares favorably against the state of the art, including the soft-margin approach and mean-field methods.
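
The discard-based baseline described in the abstract can be illustrated with a minimal sketch. It is not the paper's method or code: the toy dynamics (a symmetric random walk), the barrier encoding, and the names `sample_walk`, `satisfies`, and `rejection_sample` are all assumptions made for illustration. It draws independent trajectories from the unconditioned dynamics and keeps only those that satisfy the imposed condition.

```python
import random

def sample_walk(T, rng):
    """One unconditioned symmetric random walk of T steps, starting at x = 0."""
    x, path = 0, [0]
    for _ in range(T):
        x += rng.choice((-1, 1))
        path.append(x)
    return path

def satisfies(path, barriers):
    """True if the path avoids every barrier.

    `barriers` maps a time t to the set of positions forbidden at that time
    (a hypothetical encoding chosen for this sketch).
    """
    return all(path[t] not in blocked for t, blocked in barriers.items())

def rejection_sample(n_draws, T, barriers, seed=0):
    """Draw n_draws unconditioned walks and keep only the feasible ones."""
    rng = random.Random(seed)
    walks = (sample_walk(T, rng) for _ in range(n_draws))
    accepted = [p for p in walks if satisfies(p, barriers)]
    return accepted, len(accepted) / n_draws

# Toy condition: at t = 10 the walk must sit at x >= 6 (every x <= 4 is blocked).
barriers = {10: set(range(-10, 5))}
accepted, fraction = rejection_sample(20000, T=20, barriers=barriers, seed=1)
print(f"accepted fraction: {fraction:.4f}")
```

For this toy condition the exact acceptance probability is 56/1024, about 0.055, so roughly 95% of the draws are wasted. Stricter or more numerous conditions shrink the accepted fraction quickly, which is the inefficiency the abstract refers to.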

Paper Structure (21 sections, 56 equations, 7 figures, 1 algorithm)


Figures (7)

  • Figure 1: Panel (a) Unconditioned homogeneous random walk on a one-dimensional lattice. Time is reported on the vertical axis (up to $T=40$) and the spatial coordinate $x$ is on the horizontal axis. Panel (b) Trajectories sampled from the unconditioned homogeneous distribution. The black ones satisfy the constraints, i.e. they avoid the black horizontal barriers, while the red ones do not. The fraction of feasible trajectories, estimated numerically from a pool of samples, is about $10^{-6}$: roughly one in a million trajectories sampled from the unconditioned distribution satisfies the constraint. Panel (c) The distribution of the trajectories sampled from the CVA distribution. The color of each pixel indicates the probability that a trajectory visits the corresponding state at a specific time.
  • Figure 2: Area under the ROC curve (AUC) as a function of the number of observations for the risk assessment problem, i.e. $t^{\star} = T$, in panel (a), and for the patient-zero problem, i.e. $t^{\star} = 0$, in panel (b). The simulated contact graph is a proximity network with average connectivity $2.2/N$. For both simulations in panels (a) and (b), the total number of individuals is $N=50$, the probability of being patient zero is set to $\gamma=1/N$, and the infection rate is $\lambda=0.1$. For each epidemic realization, the inference is performed for an increasing number of noiseless observations (here $p_{\rm{FNR}}=0$) at time $t_{\rm{obs}} = T$. Thick lines and shaded areas indicate the averages and the standard errors computed over $40$ different instances.
  • Figure 3: AUC associated with the prediction of the infected individuals for the Causal Variational Approach (CVA), Belief Propagation (sib), SoftMargin (soft), and MCMC (MC), as a function of time during the epidemic propagation of an SI model on several instances of dynamic contact networks generated using the OpenABM model ferretti_contact_tracing (in panel (a) $N = 2000$, in (c) $N = 1000$) and the StEM in Ref. lorch2022quantifying (panels (b) and (d)) for $N = 904$. The infection rate is set to $\lambda=0.15$ for the latter and $\lambda = 0.02$ for the former; observations are noiseless in both cases. For panels (c) and (d), observations are performed at the last time of the dynamics, i.e. $t_{\rm{obs}} = T$. For the results in panels (a) and (b), observation times are extracted uniformly in the range $\left[1, T \right]$; at each observation time $t_{\rm obs}$, infected nodes are observed with a biased probability equal to $1.1\times N_{I}\left(t_{\rm obs}\right)/N$, where $N_{I}\left(t_{\rm obs}\right)$ is the number of infected individuals at time $t_{\rm obs}$ and $N$ is the total number of individuals. The total number of observations is $n_{\rm{obs}} = 0.1\,N$ for OpenABM and $n_{\rm{obs}} = 100$ for the StEM.
  • Figure 4: Panel (a) Heat map of the free energy ($F := -\log \mathbb{P}(\mathcal{O})$) computed at the convergence of CVA as a function of the assumed hyperparameters of the generative SI model. The experiment is performed on a proximity graph with $N=50$ individuals and density $\rho=2/N$; the epidemic model is characterized by the zero-patient probability $\gamma^*=1/N$ and the infection rate $\lambda^*=0.1$, shown here as a green star. We perform a large number of observations ($n_{\rm{obs}}=2N$) at uniformly randomly distributed times. As expected, the lowest values of this free energy are concentrated around the exact value $(\gamma^*,\lambda^*)$. The oriented paths (white arrows) represent the convergence towards the minimum of $-\log \mathbb{P}(\mathcal{O})$ obtained by performing gradient descent over the hyperparameters, starting from three different initial points in the plane $(\gamma,\lambda)$. Panel (b) Scatter plot of the inferred values of the infection rate against the ground truth. In these experiments, the zero-patient probability $\gamma=1/N$ is fixed and assumed known, while the infection rate $\lambda$ is varied. For each $\lambda$, an epidemic simulation is performed and $n_{\rm{obs}}=10N$ observations are taken at uniformly randomly distributed times.
  • Figure 5: Effects of model reduction on inferential performance and generative capabilities. The numerical experiments are performed on a proximity graph with $N=100$ individuals and density $2.2/N$. The observed epidemic realizations are generated using an SEIR model with $\gamma=1/N$, $\lambda=0.3$ (panels (a) and (b)) or $\lambda=0.15$ (panel (c)), latency delay $\nu=0.5$ and recovery delay $\mu=0.1$. Panel (a) Values of the AUC as a function of time obtained using the CVA in two observation regimes (with $n_{\rm{obs}}=N/10$ and $n_{\rm{obs}}=N/2$ observations), for three different inferred posterior distributions: an SEIR model with known hyperparameters (green diamonds), an SEIR model with unknown hyperparameters (blue circles), and an SI model with unknown hyperparameters (red squares). Shaded areas represent the error around the average value, computed using $22$ instances. Panels (b) and (c) The average fraction of infected individuals as a function of time estimated using the correct SEIR prior model (green diamonds), an SEIR prior with the inferred hyperparameters (blue circles), and an SI prior model with the inferred hyperparameters (red squares). The regimes shown correspond to unbiased observations (panel (b), for $\lambda=0.3$) and to observations preferentially sampled from large outbreaks (panel (c), for $\lambda=0.15$). The black curves represent the same quantity computed from the observed epidemic realizations. Shaded areas represent the standard error computed from $40$ realizations of the dynamics.
  • ...and 2 more figures
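
Several of the captions above report the area under the ROC curve (AUC). The captions do not say how it is computed, so the following is only a minimal sketch under stated assumptions: each individual receives a score (for example an estimated probability of being infected at time $t^{\star}$, which is an assumption here), the ground-truth label is 1 for infected and 0 otherwise, and the AUC is taken as the probability that a randomly chosen infected individual is scored above a randomly chosen non-infected one, with ties counted as one half. The function name `auc` is hypothetical.

```python
def auc(scores, labels):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly.

    scores: one number per individual, higher meaning "more likely infected".
    labels: ground truth, 1 for infected and 0 for not infected.
    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Four individuals: two truly infected (labels 1), two not (labels 0).
print(auc([0.9, 0.7, 0.6, 0.2], [1, 0, 1, 0]))  # 0.75
```

An AUC of 1 means every infected individual is ranked above every non-infected one, and 0.5 corresponds to a random ranking. The pairwise loop is quadratic in the number of individuals, which is fine at the sizes quoted in the captions but would be replaced by a sort-based computation for larger populations.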