A Computational Framework for Modeling Emergence of Color Vision in the Human Brain

Atsunobu Kotani, Ren Ng

TL;DR

This work tackles how the human cortex could infer the true color dimensionality from optic nerve signals by proposing a computational framework that jointly simulates the eye and cortex. Color is represented as an $N$-dimensional cortical space, with a learnable dimensionality $K$ that emerges from self-supervised temporal prediction of retinal signal fluctuations during fixational eye movements. The framework combines a biologically grounded eye model, a three-part cortical decoding/translation/re-encoding pipeline with neural buckets for retinal encoding properties, and quantitative (CMF-SIM) and qualitative (NS) measures to demonstrate mono-, di-, tri-, and tetrachromatic emergence, including gene-therapy-like boosts from 3D to 4D. The results offer a novel perspective on color perception as an emergent property of cortical inference, with potential implications for computational imaging, neurobiology, and vision augmentation.

Abstract

It is a mystery how the brain decodes color vision purely from the optic nerve signals it receives, with a core inferential challenge being how it disentangles internal perception with the correct color dimensionality from the unknown encoding properties of the eye. In this paper, we introduce a computational framework for modeling this emergence of human color vision by simulating both the eye and the cortex. Existing research often overlooks how the cortex develops color vision or represents color space internally, assuming that the color dimensionality is known a priori; however, we argue that the visual cortex has the capability and the challenge of inferring the color dimensionality purely from fluctuations in the optic nerve signals. To validate our theory, we introduce a simulation engine for biological eyes based on established vision science and generate optic nerve signals resulting from looking at natural images. Further, we propose a bio-plausible model of cortical learning based on self-supervised prediction of optic nerve signal fluctuations under natural eye motions. We show that this model naturally learns to generate color vision by disentangling retinal invariants from the sensory signals. When the retina contains N types of color photoreceptors, our simulation shows that N-dimensional color vision naturally emerges, verified through formal colorimetry. Using this framework, we also present the first simulation work that successfully boosts the color dimensionality, as observed in gene therapy on squirrel monkeys, and demonstrates the possibility of enhancing human color vision from 3D to 4D.
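
As a rough formalization of the self-supervised prediction described above, the objective can be sketched with the three learnable functions named in the Figure 3 caption (decoder $\Phi$, translation operator $\Omega$, re-encoder $\Psi$). The symbols $s_t$ for the optic nerve signal at time $t$ and $\hat{s}_{t+1}$ for its prediction, as well as the squared-error form of the loss, are assumptions made here for illustration and are not taken from the paper:

$$\hat{s}_{t+1} = \Psi\big(\Omega(\Phi(s_t))\big), \qquad \mathcal{L} = \lVert \hat{s}_{t+1} - s_{t+1} \rVert_2^2$$

Under this reading, the only training signal is the next optic nerve signal itself, so nothing in the objective fixes the color dimensionality in advance.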

Paper Structure

This paper contains 37 sections, 5 equations, 14 figures, 1 table, 1 algorithm.

Figures (14)

  • Figure 1: Overview of our proposed framework for modeling the emergence of human color vision. Our simulation engine of biological eyes converts a scene stimulus (hyperspectral image) to a stream of optic nerve signals (Section \ref{sec:method_eye} & Video 0:30). We simulate cortical learning purely from these optic nerve signals (Section \ref{sec:method_learning}) and show the emergence of color vision. We show how to analyze the emergent neural color quantitatively with the Color Matching Function Test Simulator ($\mathit{CMF\textrm{-}SIM}$) and qualitatively with Neural Scope ($\mathit{NS}$) (Section \ref{sec:method_representation}).
  • Figure 2: Overview of our simulation engine of biological eyes. 1. This engine takes a scene stimulus as an input, processes it through a "textbook" model of eye motion and retinal neural circuitry, to generate a stream of optic nerve signals. 2. This engine accepts custom eye and retina parameters. 3. It allows visualization of neural signals in steps (A-F), illustrating the progressive entanglement of scene imagery with retinal properties. 4. Visualization of changing one of the input parameters, the number of cone cell types -- showing that the signals become noisier as the number increases.
  • Figure 3: Overview of our hypothesized cortical learning mechanism and a study of the cortical model's learning behavior, restricted to a trichromatic retina. 1. Given the stream of optic nerve signals (ONS) as the only input data, the cortical model aims to predict the next ONS from the current one with 3 learnable functions: decoder $\Phi$, translation operator $\Omega$, and re-encoder $\Psi$. 2. Prediction error decreases as learning progresses, converging after 100K learning steps. 3. During learning, the color dimensionality of the internal percepts transitions from 1D to 2D to 3D, formally measured by $\mathit{CMF\textrm{-}SIM}$ and visualized by $\mathit{NS}$ (Fig. \ref{fig:4} & Section \ref{sec:method_representation}). 4. The cortical model infers the retinal invariant properties during learning: cell positions $\mathrm{P}$ (higher density in fovea), cone cell types $\mathrm{C}$, and lateral inhibition weights $\mathrm{W}$ (center-surround receptive field).
  • Figure 4: Overview of our methods for measuring emergent color dimensionality: Color Matching Function Test Simulator ($\mathit{CMF\textrm{-}SIM}$) and Neural Scope ($\mathit{NS}$). 1. $\mathit{CMF\textrm{-}SIM}$ determines the minimum number of primary colors needed to match any target color by iteratively updating coefficients to minimize perceptual error. 2. $\mathit{NS}$ visualizes visual percepts independently of the cortical learning loop, optimized as a learnable $N\times 3$ matrix to minimize projection error to the target RGB image. 3. Example $\mathit{CMF\textrm{-}SIM}$ output for a cortical model trained on a trichromat retina shows matching errors for 400–700 nm spectral light converging to zero with three primaries, confirming 3D color vision.
  • Figure 5: Results of simulating emergence of color vision from various retinas, with analysis of learned color dimensionality using qualitative visualization ($\mathit{NS}$) and formal methods ($\mathit{CMF\textrm{-}SIM}$). 1. Cortical models trained on datasets generated from retinas containing 1, 2, 3, and 4 cone types result in mono-, di-, tri-, and tetrachromatic color vision, respectively. 2. Qualitative color of dichromat variants is consistent with known vision science on color vision deficiency. 3. Control experiment with a trichromat retina, but with cortical learning deliberately removed: $\mathit{CMF\textrm{-}SIM}$ measures color as 1D, highlighting that cortical learning is necessary for emergence of the correct color dimensionality.
  • ...and 9 more figures
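
The color-matching idea behind $\mathit{CMF\textrm{-}SIM}$ (Figure 4) can be illustrated with a small sketch. This is not the paper's implementation, and it rests on several assumptions: the cone sensitivities are hypothetical Gaussian curves with illustrative peak wavelengths, the "perceptual error" is measured directly on cone responses rather than on the percepts of a trained cortical model, and the coefficients come from a closed-form least-squares solve rather than the iterative updates the caption describes. All function names are invented for this example.

```python
import numpy as np

def cone_sensitivities(wavelengths, peaks, width=40.0):
    """Hypothetical Gaussian sensitivity curves, one row per cone type."""
    peaks = np.asarray(peaks, dtype=float)
    return np.exp(-0.5 * ((wavelengths[None, :] - peaks[:, None]) / width) ** 2)

def matching_error(sens, primary_idx, target_idx):
    """Residual after matching one monochromatic target with the primaries.

    Coefficients may be negative, as in a classical color matching test
    (a negative weight means the primary is added to the target side).
    """
    A = sens[:, primary_idx]   # (n_cones, k): cone responses to each primary
    b = sens[:, target_idx]    # (n_cones,): cone responses to the target
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(np.linalg.norm(A @ coeffs - b))

def min_primaries(sens, tol=1e-6, max_k=6):
    """Smallest k for which k evenly spaced primaries match every wavelength."""
    n = sens.shape[1]
    for k in range(1, max_k + 1):
        # Pick k interior sample points as the primary wavelengths.
        primary_idx = np.linspace(0, n - 1, k + 2).round().astype(int)[1:-1]
        worst = max(matching_error(sens, primary_idx, t) for t in range(n))
        if worst < tol:
            return k
    return None

wavelengths = np.arange(400.0, 701.0, 10.0)  # 400-700 nm in 10 nm steps

# Illustrative peak wavelengths (nm); assumptions, not values from the paper.
retinas = {
    1: [540.0],
    2: [440.0, 540.0],
    3: [440.0, 540.0, 570.0],
    4: [440.0, 500.0, 540.0, 570.0],
}

results = {
    n_cones: min_primaries(cone_sensitivities(wavelengths, peaks))
    for n_cones, peaks in retinas.items()
}
print(results)
```

Under these assumptions, the number of primaries reported for each retina should equal its number of cone types, because the matching problem reduces to the rank of the cone response matrix. The paper's setting is harder: the measurement is made on percepts learned by the cortical model, which is why the control experiment in Figure 5 can report 1D color even for a trichromat retina.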