Finite integration time can shift optimal sensitivity away from criticality

Sahel Azizpour; Viola Priesemann; Johannes Zierenberg; Anna Levina

TL;DR

This work demonstrates, analytically and computationally, how a finite integration time determines the optimal tuning of a recurrent neural network, and finds that networks attain different sensitivities depending on the available time.

Abstract

Sensitivity to small changes in the environment is crucial for many real-world tasks, enabling living and artificial systems to make correct behavioral decisions. It has been shown that such sensitivity is maximized when a system operates near the critical point of a phase transition. However, proximity to criticality introduces large fluctuations and diverging timescales. Hence, leveraging the maximal sensitivity would require impractically long integration periods. Here, we analytically and computationally demonstrate how the optimal tuning of a recurrent neural network is determined given a finite integration time. Rather than maximizing the theoretically available sensitivity, we find that networks attain different sensitivities depending on the available time. Consequently, the optimal dynamic regime can shift away from criticality when integration times are finite, highlighting the necessity of incorporating finite-time considerations into studies of information processing.

Paper Structure (10 sections, 22 equations, 4 figures)

Methods
Data & Code availability
Acknowledgements
Supplementary Information

Figures (4)

Figure 1: Fluctuations in network activity can lead to unreliable input reconstruction. a) Illustration of the neural reservoir: A subset of recurrently connected neurons, with the largest eigenvalue of the connectivity matrix equal to $\lambda$, receives Poisson spikes with rate $h$; the output integrates spikes of a subset of neurons with a timescale $T$ and is subject to noise $\eta$. b) Temporal evolution of the output for different $\lambda$ and $T$, comparing the responses to two different inputs $h_1 < h_2$ with mean responses (marked by the arrows on the $y$-axis) fixed for all panels at the same values with $\langle o^T\rangle(h_1) < \langle o^T\rangle(h_2)$. Solid lines are the pure output response of the network, and opaque areas include noise. Gray regions highlight the times when the order of the output responses differs from the order of their mean responses. The minimal discrimination error of the two underlying distributions is given by $\varepsilon$, cf. Eq. \ref{Eq:MDE}. c) Response curve showing output values as a function of logarithmic input rates. The solid black line shows the mean response, and the gray shading indicates the strength of fluctuations stemming from neural activity and output noise. Inputs are called discriminable when their output distributions have a sufficiently small overlap (see blue areas on the right). The first inputs that are discriminable from zero and full activity mark the dynamic range (black dashed lines). From these, we can construct sets of discriminable inputs marked by the black triangles (see text for details).
Figure 2: Dynamical regime for optimal information transmission depends on the readout integration timescale. Discriminating input rates from a noisy output (cf. Fig. \ref{fig1}), we observe that a) the number of discriminable inputs as well as b) the dynamic range are maximal for sub-critical networks ($\lambda<1$). With increasing timescale $T$, the maximum $\lambda^\ast$ moves closer to the critical point as demonstrated in the insets. Parameters: $N=10^4$, $K=10^2$, $\mu=0.2$, $\nu=0.2$, $\sigma=10^{-2}$, $\varepsilon=10^{-1}$.
Figure 3: Workflow to calculate $\varepsilon$-discriminable inputs. In the first step (a-c), we obtain the distribution $P(a^T|h)$ of pure output activity in a reservoir (with parameter $\lambda$) subject to an input $h$. (a) For $T\to\infty$, these distributions become $\delta$-distributions that we calculate using our mean-field approximation. (b) For finite $T$, we perform many numerical simulations and fit a Beta distribution to the data using maximum likelihood estimates (orange examples obtained for $T=100$). We then use these fit results to train a deep neural network as a general function approximator that interpolates the fit parameters $(\alpha,\beta)$ for all simulation parameters $(\lambda, h, T)$. (c) In the limit $T\to 0$, we obtain analytical results by solving the Fokker-Planck equation; these are in good agreement with the corresponding numerical data for $T=1$ (yellow histograms). (d) In the next step, we obtain the distribution of noisy output responses $P(o^T|h)$ by a convolution of $P(a^T|h)$ with a Gaussian $\mathcal{N}(0,\sigma^2)$ of small variance $\sigma^2$. This step i) connects to previous mean-field results for $T\to\infty$ \cite{kinouchi_optimal_2006} and ii) circumvents numerical intricacies for finite $N$ at the boundaries. The example compares Beta distributions from the neural network interpolations (orange) with their corresponding noisy distributions (gray), which mostly differ at the boundaries. (e-f) In the last step, we determine two sets of discriminable inputs that can be discriminated from reference distributions for vanishing input (left Gaussian distribution, black) and diverging input (right Gaussian distribution, black). For this, we start from the left and right references and perform iterative bisection searches in $h$ to find input values whose response distributions overlap exactly $\varepsilon$ with the previous one. The dynamic range is calculated from the smallest and largest inputs (marked green). The number of discriminable inputs is obtained as the average size of the sets. Examples are shown for $\lambda=0.999$, $\varepsilon=0.1$, $\sigma=0.01$. Example distributions are shown for $h\in\left[5.6\cdot 10^{-5}, 1.8\cdot 10^{-3}, 5.6\cdot 10^{-3}, 1.8\cdot 10^{-2}, 3.2\cdot 10^{-1} \right]$.
Figure S1: Number of discriminable inputs (left) and dynamic range (right), analogous to Fig. 2 of the main text but with $N^\text{out}=N$ to allow a better comparison with the analytic solution. Notice that in this case the numerical results interpolate better between the analytic limits.
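The last two steps of the workflow in the Figure 3 caption (Gaussian convolution of a Beta distribution, then iterative bisection in $h$ until consecutive response distributions overlap by $\varepsilon$) can be illustrated with a minimal Python sketch. Several pieces here are assumptions and not taken from the paper: `toy_beta_params` is a hypothetical stand-in for the fitted and interpolated $(\alpha,\beta)$, the overlap is taken to be $\int \min(p, q)\,do$, the reference inputs `1e-6` and `1e2` stand in for vanishing and diverging input, and the dynamic range uses the conventional decibel form $10\log_{10}(h_\text{high}/h_\text{low})$.

```python
import numpy as np
from scipy.stats import beta, norm

# Bin edges for the output, padded beyond [0, 1] so the Gaussian noise fits.
EDGES = np.linspace(-0.1, 1.1, 2401)
DX = EDGES[1] - EDGES[0]

def toy_beta_params(h, kappa=200.0, h_half=1e-2):
    """Hypothetical (alpha, beta) as a function of input rate h (NOT the paper's fit)."""
    m = h / (h + h_half)              # toy saturating mean response in (0, 1)
    return m * kappa, (1.0 - m) * kappa

def noisy_output_mass(h, sigma=0.01):
    """Binned P(o|h): Beta probability mass convolved with N(0, sigma^2)."""
    a, b = toy_beta_params(h)
    mass = np.diff(beta.cdf(EDGES, a, b))          # bin masses avoid pdf singularities
    half = int(np.ceil(5 * sigma / DX))
    kernel = norm.pdf(np.arange(-half, half + 1) * DX, scale=sigma)
    kernel /= kernel.sum()
    return np.convolve(mass, kernel, mode="same")

def overlap(p, q):
    """Assumed overlap measure: sum of the bin-wise minimum of two mass vectors."""
    return float(np.minimum(p, q).sum())

def next_discriminable(h_prev, log_h_end, eps=0.1, tol=1e-4):
    """Bisection in log10(h) for the input whose overlap with h_prev equals eps."""
    p_prev = noisy_output_mass(h_prev)
    near, far = np.log10(h_prev), log_h_end
    if overlap(p_prev, noisy_output_mass(10.0 ** far)) > eps:
        return None                                # nothing discriminable remains
    while abs(far - near) > tol:
        mid = 0.5 * (near + far)
        if overlap(p_prev, noisy_output_mass(10.0 ** mid)) > eps:
            near = mid                             # still too similar, move away
        else:
            far = mid                              # discriminable, tighten the bound
    return 10.0 ** far

def discriminable_chain(h_ref, h_end, eps=0.1):
    """Inputs discriminable from h_ref and from each other, walking toward h_end."""
    chain, h = [], h_ref
    while True:
        h = next_discriminable(h, np.log10(h_end), eps)
        if h is None:
            return chain
        chain.append(h)

left = discriminable_chain(1e-6, 1e2)    # starts from the (near-)zero-input reference
right = discriminable_chain(1e2, 1e-6)   # starts from the (near-)saturating reference
n_discriminable = 0.5 * (len(left) + len(right))
dynamic_range_db = 10.0 * np.log10(right[0] / left[0])   # assumed decibel convention
print(n_discriminable, dynamic_range_db)
```

Working with binned probability mass (differences of the Beta CDF) rather than the density is a design choice of this sketch: it sidesteps the divergence of the Beta density at the boundaries when $\alpha<1$ or $\beta<1$.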