NECOMIMI: Neural-Cognitive Multimodal EEG-informed Image Generation with Diffusion Models

Chi-Sheng Chen

TL;DR

A key discovery of this work is that the model tends to generate abstract or generalized images, such as landscapes, rather than specific objects, highlighting the inherent challenges of translating noisy and low-resolution EEG data into detailed visual outputs.

Abstract

NECOMIMI (NEural-COgnitive MultImodal EEG-Informed Image Generation with Diffusion Models) introduces a novel framework for generating images directly from EEG signals using advanced diffusion models. Unlike previous works that focused solely on EEG-image classification through contrastive learning, NECOMIMI extends this task to image generation. The proposed NERV EEG encoder demonstrates state-of-the-art (SoTA) performance across multiple zero-shot classification tasks, including 2-way, 4-way, and 200-way, and achieves top results in our newly proposed Category-based Assessment Table (CAT) Score, which evaluates the quality of EEG-generated images based on semantic concepts. A key discovery of this work is that the model tends to generate abstract or generalized images, such as landscapes, rather than specific objects, highlighting the inherent challenges of translating noisy and low-resolution EEG data into detailed visual outputs. Additionally, we introduce the CAT Score as a new metric tailored for EEG-to-image evaluation and establish a benchmark on the ThingsEEG dataset. This study underscores the potential of EEG-to-image generation while revealing the complexities and challenges that remain in bridging neural activity with visual representation.
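The abstract reports 2-way, 4-way, and 200-way zero-shot classification but does not spell out the evaluation procedure. The sketch below shows one common way such k-way accuracy is computed for contrastively trained encoders, and it is an assumption rather than the paper's protocol: each EEG embedding is compared by cosine similarity against the image embedding of its true class plus k-1 randomly drawn distractor classes. The function names (`l2_normalize`, `k_way_accuracy`), the one-embedding-per-class layout, and the random embeddings in the usage lines are all hypothetical.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def k_way_accuracy(eeg_emb, img_emb, k, seed=0):
    """Illustrative k-way zero-shot accuracy (assumed protocol, not the paper's).

    eeg_emb, img_emb: arrays of shape (n_classes, dim), where row i of each
    array is assumed to belong to the same class.
    """
    rng = np.random.default_rng(seed)
    eeg = l2_normalize(eeg_emb)
    img = l2_normalize(img_emb)
    n = eeg.shape[0]
    correct = 0
    for i in range(n):
        # Draw k-1 distractor classes, excluding the true class i.
        others = np.delete(np.arange(n), i)
        distractors = rng.choice(others, size=k - 1, replace=False)
        # The true class is placed first among the candidates.
        candidates = np.concatenate(([i], distractors))
        sims = img[candidates] @ eeg[i]
        # A trial counts as correct when the true class scores highest
        # (argmax returns the first index on exact ties).
        correct += int(np.argmax(sims) == 0)
    return correct / n


# Usage with random placeholder embeddings (no real EEG data involved).
demo_rng = np.random.default_rng(42)
fake_img = demo_rng.normal(size=(200, 64))
fake_eeg = fake_img + demo_rng.normal(scale=2.0, size=(200, 64))
for k in (2, 4, 200):
    print(k, k_way_accuracy(fake_eeg, fake_img, k))
```

With k equal to the number of classes, every other class acts as a distractor, so the 200-way setting reduces to plain top-1 retrieval over all classes.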

Paper Structure

This paper contains 24 sections, 14 equations, 49 figures, and 6 tables.

Figures (49)

  • Figure 1: This image demonstrates the capability of the NECOMIMI model to reconstruct images purely from EEG data without using the "Seen" images (ground truth) as embeddings during the generation process. The two-stage NECOMIMI architecture effectively extracts semantic information from noisy EEG signals, showing that it can capture and represent the underlying concepts from brainwave activity. The bottom row of images, generated solely from EEG input, highlights the potential of NECOMIMI to approximate the content of the "Seen" images in the top row, even in the absence of any direct visual reference or embedding.
  • Figure 2: The figure illustrates the entire workflow of the EEG-based image generation model.
  • Figure 3: This diagram shows the overall structure and workflow of the NERV EEG encoder model.
  • Figure 4: The image illustrates the progression of visual representations generated using different embedding techniques in a diffusion model: (a) Top row: The original images shown to subjects (ground truth). (b) Second row: Images generated from the CLIP-ViT embeddings of the original images. (c) Third row: Images generated by the one-stage method using pure EEG embeddings with the NERV EEG encoder. (d) Fourth row: Images generated by the two-stage NECOMIMI method using pure EEG embeddings with the NERV EEG encoder.
  • Figure 5: Randomly selected generated images for Subject 6 with the NICE EEG encoder.
  • ...and 44 more figures