Deep Learning for Visual Neuroprosthesis
Peter Beech, Shanshan Jia, Zhaofei Yu, Jian K. Liu
TL;DR
This chapter surveys how visual information is encoded from the retina to the cortex and how deep learning models can decode and reconstruct visual scenes from neural activity. It covers CNNs for feature extraction, RNNs for temporal dynamics, generative approaches (VAEs, GANs), and end-to-end spike-to-image frameworks such as the Spike Image Decoder (SID), all with the aim of advancing visual neuroprostheses. Highlighted work includes SID, which decodes images end-to-end from retinal spikes; DCGAN-based reconstructions from fMRI; and semi-supervised Bayesian models that enable robust category decoding with limited data. Together, these approaches illustrate how computational models can reveal visual coding principles and guide the design of next-generation retinal and broader visual neuroprostheses.
Abstract
The visual pathway involves complex networks of cells and regions that contribute to the encoding and processing of visual information. While some aspects of visual perception are understood, many questions remain unanswered regarding the exact mechanisms of visual encoding and the organization of visual information along the pathway. This chapter discusses the importance of visual perception and the challenges associated with understanding how visual information is encoded and represented in the brain. It also introduces the concept of neuroprostheses, devices designed to enhance or replace bodily functions, and highlights the importance of constructing computational models of the visual pathway when implementing such devices. A number of such models based on deep learning are outlined, and their value for understanding visual coding and natural vision is discussed.
