SynSacc: A Blender-to-V2E Pipeline for Synthetic Neuromorphic Eye-Movement Data and Sim-to-Real Spiking Model Training

Khadija Iddrisu, Waseem Shariff, Suzanne Little, Noel O'Connor

TL;DR

SynSacc introduces a Blender-to-V2E pipeline to generate controlled synthetic eye-movement data (saccades and fixations) and convert them into event streams for neuromorphic training. It compares DenseSNN and ConvSNN architectures using binary spike representations with rate coding, achieving up to 0.83 accuracy and demonstrating robustness across temporal resolutions, while highlighting substantial computational efficiency gains over ANN counterparts. The study validates synthetic-to-real transfer by fine-tuning on a real event dataset (EV-Eye), showing that synthetic pretraining reduces the amount of real data required for competitive performance. Collectively, the work demonstrates the practicality of synthetic event-based data to pretrain and deploy energy-efficient SNNs for fine-grained eye-movement classification in resource-constrained scenarios.

Abstract

The study of eye movements, particularly saccades and fixations, is fundamental to understanding the mechanisms of human cognition and perception. Accurate classification of these movements requires sensing technologies capable of capturing rapid dynamics without distortion. Event cameras, also known as Dynamic Vision Sensors (DVS), provide asynchronous recordings of changes in light intensity, thereby eliminating the motion blur inherent in conventional frame-based cameras and offering superior temporal resolution and data efficiency. In this study, we introduce a synthetic dataset generated with Blender to simulate saccades and fixations under controlled conditions. Leveraging Spiking Neural Networks (SNNs), we evaluate its robustness by training two architectures and fine-tuning on real event data. The proposed models achieve up to 0.83 accuracy and maintain consistent performance across varying temporal resolutions, demonstrating stability in eye movement classification. Moreover, the use of SNNs with synthetic event streams yields substantial computational efficiency gains over artificial neural network (ANN) counterparts, underscoring the utility of synthetic data augmentation in advancing event-based vision. All code and datasets associated with this work are available at https://github.com/Ikhadija-5/SynSacc-Dataset.

Paper Structure

This paper contains 26 sections, 15 equations, 2 figures, 6 tables, 1 algorithm.

Figures (2)

  • Figure 1: Overview of the data generation and processing pipeline: The workflow starts with a Blender 3D model, where camera and lighting settings are configured for realistic rendering. Textures, including detailed iris textures, are then applied to the model. The eyes are cropped from the rendered images to create focused sequences, which are subsequently converted into videos. These RGB sequences are transformed into event-based streams using the v2e framework. Finally, the event streams are represented as LIF binary spikes, which are used to train and predict with a spiking neural network (SNN).
  • Figure 2: Visualization of a Blender-rendered saccade sequence in frames (top row) and the corresponding event stream visualized as frames (bottom row).
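
The last stages of the Figure 1 pipeline (event stream, binary spikes, spiking network with rate-coded output) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the decay factor `beta`, the firing `threshold`, the reset-by-subtraction rule, and the choice to ignore event polarity are all assumptions made for illustration, and the excerpt does not specify the paper's actual encoding parameters.

```python
import numpy as np

def events_to_binary_frames(t, x, y, num_bins, height, width):
    """Bin events (timestamps t, pixel coords x, y) into binary frames.

    A pixel is 1 in a bin if at least one event fired there, else 0.
    Polarity is ignored in this sketch (an assumption, not the paper's choice).
    """
    t = np.asarray(t, dtype=np.float64)
    frames = np.zeros((num_bins, height, width), dtype=np.uint8)
    if t.size == 0:
        return frames
    span = max(t.max() - t.min(), 1e-12)
    # Map each timestamp to a bin index in [0, num_bins - 1].
    bins = np.minimum(((t - t.min()) / span * num_bins).astype(int), num_bins - 1)
    frames[bins, np.asarray(y), np.asarray(x)] = 1
    return frames

def lif_forward(inputs, beta=0.9, threshold=1.0):
    """Run a layer of leaky integrate-and-fire (LIF) neurons over time.

    inputs: array of shape (T, N) with the input current per time step.
    Returns binary spikes of shape (T, N).
    """
    mem = np.zeros(inputs.shape[1])
    spikes = np.zeros(inputs.shape, dtype=np.uint8)
    for step, current in enumerate(inputs):
        mem = beta * mem + current                   # leak, then integrate
        fired = mem >= threshold
        spikes[step] = fired
        mem = np.where(fired, mem - threshold, mem)  # reset by subtraction
    return spikes

def rate_decode(output_spikes):
    """Rate coding: pick the class whose output neuron spiked most often."""
    return int(np.argmax(output_spikes.sum(axis=0)))

# Four toy events on a 2x3 sensor, split into two time bins.
frames = events_to_binary_frames(
    t=[0.0, 0.1, 0.5, 0.9], x=[0, 1, 1, 2], y=[0, 0, 1, 1],
    num_bins=2, height=2, width=3,
)
# One LIF neuron driven by a constant sub-threshold current, then silence.
spikes = lif_forward(np.array([[0.6], [0.6], [0.0]]))
```

Changing `num_bins` changes the temporal resolution of the spike tensor, which is the kind of variation the abstract refers to when it reports consistent performance across temporal resolutions.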