How does the brain compute with probabilities?

Ralf M. Haefner; Jeff Beck; Cristina Savin; Mehrdad Salmasi; Xaq Pitkow

TL;DR

This perspective provides a unified language for defining competing hypotheses about probabilistic computation in the brain, explains the fundamentals of three prominent proposals, and describes their similarities and differences in that common language.

Abstract

This perspective piece is the result of a Generative Adversarial Collaboration (GAC) tackling the question "How does neural activity represent probability distributions?". We address three major obstacles to progress on answering this question. First, we provide a unified language for defining competing hypotheses. Second, we explain the fundamentals of three prominent proposals for probabilistic computations, namely Probabilistic Population Codes (PPCs), Distributed Distributional Codes (DDCs), and Neural Sampling Codes (NSCs), and describe their similarities and differences in that common language. Third, we review key empirical data previously taken as evidence for at least one of these proposals, and describe how those data may or may not be explainable by the alternative proposals. Finally, we describe some key challenges in resolving the debate, and propose potential directions to address them through a combination of theory and experiments.

Paper Structure (94 sections, 19 equations, 8 figures, 1 table)

Introduction
Why probability in the brain?
What makes a 'good' neural representation?
What does it mean for the brain to represent probabilities?
What is z? What variables are the probabilities about?
What is o? What evidence are the probabilities conditioned on?
What happened to s? What about the experimentally defined task stimulus?
Primary visual cortex (V1):
Medial temporal area (MT):
Hippocampus (CA1):
What is p(z|o)? What is the posterior probability that the brain would like to infer?
What is q(z|o)? What is the brain's approximate inference?
What is r? Which neural properties do the representing?
What is the relationship between r and q? How does neural activity represent probabilities?
What are the dynamics of probabilistic computations?
...and 79 more sections

Figures (8)

Figure 1: Relationships between key quantities for probabilistic inference. A: Schematic of the different elements of Bayesian inference and its neural implementation. Not to be interpreted as a graphical model! B: The computational goal of a Bayesian brain is to infer the brain's latent variables ${\mathbf{z}}$ from observations ${\boldsymbol{o}}$. C: Inferential dynamics at the algorithmic level, for a static problem. Latent causes in the world generate observations, which the brain interprets through approximate inference dynamics in terms of its own latents ${\mathbf{z}}$, eventually producing an action or decision. In theoretical models of inference, neural activity is a consequence of these algorithmic dynamics. D: In reality, the physical mechanism or implementation of this process has a different causal diagram without the interpretable approximate posteriors, and the inferential dynamics are merely an abstract interpretation of the activity in the biophysical system.
Figure 2: Examples of how a posterior distribution $q({\mathbf{z}}|{\boldsymbol{o}})$ (a) could be mapped to neural activity. The distribution could be parameterized (b) and these parameters (c) could be mapped to neural activity (e). Alternatively, samples from the posterior (d) could determine neural responses. It is also possible to interpolate between these options by sampling the parameters (lange2022interpolating). (Adapted from lange2022task.)
Figure 3: For a static posterior, $p({\mathbf{z}}|{\boldsymbol{o}})$, the approximate posteriors, $q_t({\mathbf{z}}|{\boldsymbol{o}})$, generally change over time. a: Parametric codes allow parameters ${\boldsymbol{\theta}}_t$ to depend on time. b: Dynamics can produce sequences of samples, which when accumulated gradually fill out the posterior. c: Dynamics can also produce sequences of sampled parameters (lange2022interpolating) rather than samples of latent variables.
Figure 4: Distributed Distributional Codes (DDC): representation and decoding. (a) Five DDC encoding functions are assumed to represent the distribution. In the absence of uncertainty, e.g. $p({\mathbf{z}}|{\boldsymbol{o}}) = \delta({\mathbf{z}} - {\mathbf{z}}_{\boldsymbol{o}})$, the values of the encoding functions at ${\mathbf{z}}_{\boldsymbol{o}}$ represent the deterministic value of the latent variable (filled circles and the bar plot in b). (c) Under uncertainty, we illustrate the exact posterior distribution, $p({\mathbf{z}}|{\boldsymbol{o}})$, as a mixture of two Gaussian distributions (gray). (d) The DDC representation is based on the expected values of the encoding functions under the full posterior distribution. The approximate posterior, $q({\mathbf{z}}|{\boldsymbol{o}})$, that is decoded from the representation depends on additional decoding choices, two of which are shown here. Dashed black line: the maximum entropy distribution derived from the DDC values in d. Solid black line: the sparsity-regularized decoding of the belief.
Figure 5: Illustration of neural sampling codes using both continuous and binary latents.
...and 3 more figures
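
The DDC representation described in the Figure 4 caption (expected values of encoding functions under the posterior) can be illustrated with a small numerical sketch. This is not the authors' code: the Gaussian-bump encoding functions, their centers and width, the grid, and the mixture parameters are all assumptions chosen for illustration, and the names `encoding_functions` and `ddc_response` are hypothetical.

```python
import numpy as np

def encoding_functions(z, centers, width=1.0):
    """Hypothetical Gaussian-bump encoding functions phi_i(z).

    Returns an array of shape (len(z), len(centers)).
    """
    z = np.atleast_1d(z)[:, None]
    return np.exp(-0.5 * ((z - centers[None, :]) / width) ** 2)

def ddc_response(z_grid, posterior, centers, width=1.0):
    """DDC values r_i = E_p[phi_i(z)], approximated on a uniform grid."""
    weights = posterior / posterior.sum()  # normalized grid weights
    return weights @ encoding_functions(z_grid, centers, width)

def gaussian(z, mean, sd):
    """Gaussian density, used here only to build example posteriors."""
    return np.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

z_grid = np.linspace(-10.0, 10.0, 2001)
centers = np.linspace(-4.0, 4.0, 5)  # five encoding functions, as in the caption

# Under uncertainty: posterior is a mixture of two Gaussians (assumed parameters).
posterior = 0.5 * gaussian(z_grid, -2.0, 0.7) + 0.5 * gaussian(z_grid, 2.0, 0.7)
r_uncertain = ddc_response(z_grid, posterior, centers)

# No uncertainty: a delta posterior at z_o, so r_i is simply phi_i(z_o).
z_o = 2.0
r_certain = encoding_functions(z_o, centers)[0]

print("DDC values under uncertainty:", np.round(r_uncertain, 3))
print("DDC values without uncertainty:", np.round(r_certain, 3))
```

The sketch covers only the encoding step. Decoding an approximate posterior from these values (for example by maximum entropy, as mentioned in the caption) requires additional choices that are not shown here.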
