A neuromorphic model of the insect visual system for natural image processing

Adam D. Hines, Karin Nordström, Andrew B. Barron

TL;DR

A bio-inspired vision model captures principles of the insect visual system to transform dense visual input into sparse, discriminative codes. It is trained with a fully self-supervised contrastive objective, which enables representation learning without labeled data and supports reuse across tasks without reliance on domain-specific classifiers.

Abstract

Insect vision supports complex behaviors including associative learning, navigation, and object detection, and has long motivated computational models for understanding biological visual processing. However, many contemporary models prioritize task performance while neglecting biologically grounded processing pathways. Here, we introduce a bio-inspired vision model that captures principles of the insect visual system to transform dense visual input into sparse, discriminative codes. The model is trained using a fully self-supervised contrastive objective, enabling representation learning without labeled data and supporting reuse across tasks without reliance on domain-specific classifiers. We evaluated the resulting representations on flower recognition tasks and natural image benchmarks. The model consistently produced reliable sparse codes that distinguish visually similar inputs. To support different modelling and deployment uses, we implemented the model as both an artificial neural network and a spiking neural network. In a simulated localization setting, our approach outperformed a simple image-downsampling baseline, highlighting the functional benefit of incorporating neuromorphic visual processing pathways. Collectively, these results advance insect computational modelling by providing a generalizable bio-inspired vision model capable of sparse computation across diverse tasks.
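
The contrastive objective mentioned in the abstract is identified in the Figure 2 caption as an NT-Xent loss computed in a SimCLR-style setup. The following is a minimal sketch of that loss in plain Python, not the authors' implementation: the function names `cosine` and `nt_xent`, the temperature of 0.5, and the toy two-dimensional embeddings are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(view_a, view_b, temperature=0.5):
    """NT-Xent loss for N positive pairs (view_a[i], view_b[i]).

    view_a and view_b hold the embeddings of two augmented views of the
    same N images. The temperature value is an illustrative assumption.
    """
    z = list(view_a) + list(view_b)  # 2N embeddings in one batch
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same image
        logits = [cosine(z[i], z[k]) / temperature for k in range(2 * n)]
        # The denominator sums over every other sample in the batch (k != i).
        denom = sum(math.exp(l) for k, l in enumerate(logits) if k != i)
        total += math.log(denom) - logits[pos]
    return total / (2 * n)


# Toy check: matching views should give a lower loss than shuffled views.
aligned = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

In this sketch the loss falls as the two views of the same image move closer in cosine similarity relative to all other samples in the batch, which is the sense in which the objective needs no labels.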

Paper Structure (16 sections, 30 equations, 7 figures)

Figures (7)

  • Figure 1: Overview of the structure and training of the vision system. (A) Activation (Act.) with homeostatic (homeo.) and lateral inhibition (lat. inh.) normalization layers were implemented for (B) each layer of the CNN, emulating biological processes, including separate chromatic (chrom.) and achromatic (achrom.) processing. (C) The output is a sparse, 1,024-dimensional linear Kenyon Cell (KC) code from the visual projection neurons (VPN) formed from the anterior superior optic tract (asot), anterior inferior optic tract (aiot), and lateral optic tract (lot), representing the Top K activations from a given input photoreceptor.
  • Figure 2: Visual model training using self-supervised contrastive learning. (A) Example images from the Tiny ImageNet dataset (Le2015), which consists of 100,000 64x64 pixel images from 200 different data classes. (B) Schematic example of the SimCLR training method (Chen2020). Images from the Tiny ImageNet dataset are converted to blue:green images and have two independent random augmentations applied. The augmented views are then processed through the visual model (Fig. 1), and an NT-Xent loss is calculated to minimize the distance between the KC representations that the vision model produces. (C) Example activations from a fully trained model, showing which visual features the vision model responds to most strongly.
  • Figure 3: The ANN version of the vision model produces low-dimensional Kenyon cell representations. (A) Example activations of each layer of the vision model for two different flower inputs (lavender and sunflower), highlighting activations from the CNN. (B) Kenyon cell representations of the lavender (top) and sunflower (bottom) show distinct activation patterns that allow them to be (C) separated and distinguished from one another reliably. The KC layer has 1,024 output neurons, with each neuron index representing a single KC neuron.
  • Figure 4: The SNN version of the vision model creates sparse Kenyon cell codes through spiking neuron dynamics. (A) Representative spiking activations from each layer of the SNN for 75x75 pixel inputs of a lavender and a sunflower, (B) showing varied sparse KC representations. (C) The SNN is capable of creating unique KC codes for individual flower classes, as measured by cosine similarity of KC features.
  • Figure 5: Temporal accumulation of information allows for complex image classification. (A) Example input image from the 17 Category Flower dataset (Nilsback06) in its original resolution, with a scanning path of 75x75 pixel patches used to generate temporally accumulated information. (B) Comparison of linear classification performance of an artificial network in predicting the correct flower species from scanning activations. Raw image inputs to a linear classifier, without any visual processing, performed as well as the untrained vision model (n=8, p=0.989, one-way ANOVA with Tukey's HSD). Training over a single epoch produces accuracy over 70%, a 30% increase over untrained or raw inputs, with a peak of 76.6% after 3 training epochs (n=8, p < 0.0001, one-way ANOVA with Tukey's HSD). (C) Cosine similarity matrices of raw pixel similarity and KC output for the 17 species of flowers used in B.
  • ...and 2 more figures
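
The captions above describe the KC output as a sparse code built from the Top K activations, and compare KC codes by cosine similarity. The sketch below shows one way those two steps could look in plain Python. It is an illustration under stated assumptions, not the paper's code: the helper names `top_k_code` and `cosine_similarity`, the value of `k`, and the four-element toy vectors (standing in for the 1,024-dimensional KC layer) are all hypothetical.

```python
import math


def top_k_code(activations, k):
    """Keep the k largest activations and set all others to zero.

    Ties are broken in favour of the lower index (sorted() is stable).
    The value of k is an assumption; the source does not state it here.
    """
    if not 0 < k <= len(activations):
        raise ValueError("k must be between 1 and len(activations)")
    order = sorted(range(len(activations)),
                   key=lambda i: activations[i], reverse=True)
    keep = set(order[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(activations)]


def cosine_similarity(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical activations for two inputs (toy stand-ins for KC outputs).
code_a = top_k_code([0.9, 0.8, 0.1, 0.0], k=2)
code_b = top_k_code([0.0, 0.1, 0.8, 0.9], k=2)

print(code_a)
print(cosine_similarity(code_a, code_a))
print(cosine_similarity(code_a, code_b))
```

Because only k entries survive, two inputs whose strongest activations fall on different neurons end up with non-overlapping codes and a cosine similarity of zero, which is the kind of separation the similarity matrices in the figures are meant to show.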