Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models via fMRI

Xuan-Bac Nguyen, Xin Li, Pawan Sinha, Samee U. Khan, Khoa Luu

TL;DR

Brainformer is a straightforward yet effective Transformer-based framework that analyzes functional Magnetic Resonance Imaging (fMRI) patterns in the human perception system from a machine-learning perspective and transfers the brain activity patterns found in fMRI signals to machine vision models.

Abstract

Human perception plays a vital role in forming beliefs and understanding reality. A deeper understanding of brain functionality will lead to the development of novel deep neural networks. In this work, we introduce Brainformer, a straightforward yet effective Transformer-based framework for analyzing functional Magnetic Resonance Imaging (fMRI) patterns in the human perception system from a machine-learning perspective. First, we present the Multi-scale fMRI Transformer to explore brain activity patterns through fMRI signals. This architecture includes a simple yet efficient module for encoding high-dimensional fMRI signals and incorporates a novel embedding technique called 3D Voxels Embedding. Second, drawing inspiration from the functionality of the brain's Regions of Interest, we introduce a novel loss function called Brain fMRI Guidance Loss. This loss uses fMRI data to make the deep neural network mimic the brain activity patterns of these regions. This work introduces a prospective approach to transferring knowledge from human perception to neural networks. Our experiments demonstrate that leveraging fMRI information allows the machine vision model to achieve results comparable to state-of-the-art methods in various image recognition tasks.
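
The idea of aligning vision features with fMRI features can be illustrated with a generic contrastive objective. The sketch below is a plain InfoNCE-style loss, not the paper's Brain fMRI Guidance Loss, whose exact form is not given in this summary. The function name `info_nce`, the `temperature` value, and the toy feature vectors are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(image_feats, fmri_feats, temperature=0.1):
    """Generic one-directional InfoNCE (hypothetical stand-in, not the paper's loss).

    Image feature i is treated as the positive match for fMRI feature i;
    every other fMRI feature in the batch acts as a negative.
    """
    imgs = [l2_normalize(f) for f in image_feats]
    fmris = [l2_normalize(f) for f in fmri_feats]
    total = 0.0
    for i, img in enumerate(imgs):
        logits = [dot(img, f) / temperature for f in fmris]
        # Numerically stable log-sum-exp over all candidate fMRI features.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(imgs)


# Toy 2-D features (made up for illustration only).
image_feats = [[1.0, 0.0], [0.0, 1.0]]
aligned_fmri = [[0.9, 0.1], [0.1, 0.9]]  # fMRI i points the same way as image i
swapped_fmri = [[0.1, 0.9], [0.9, 0.1]]  # fMRI i points toward the other image

loss_aligned = info_nce(image_feats, aligned_fmri)
loss_swapped = info_nce(image_feats, swapped_fmri)
print(loss_aligned, loss_swapped)
```

With these toy inputs the loss is lower when each fMRI feature points in the same direction as its paired image feature, which is the behavior a contrastive alignment term is meant to reward.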

Paper Structure

This paper contains 20 sections, 6 equations, 6 figures, 4 tables, and 1 algorithm.

Figures (6)

  • Figure 1: Given a pair of images and fMRI signals (x-axis is the voxel index, y-axis is the magnitude of the voxel response), Brainformer can explore the local patterns of fMRI signals from brain regions and discover their interactions. Best viewed in color.
  • Figure 2: Brainformer utilizes fMRI signals (x-axis is the voxel index, y-axis is the magnitude of the voxel response) from specific brain regions as input, extracting the local features representing patterns within each region. The $\texttt{TransformerBlock}$ measures the correlation among these regions to emulate brain activities. This information is subsequently transferred to the vision model through Contrastive Loss and Brain fMRI Guidance Loss.
  • Figure 3: Details of the Multi-scale fMRI Transformer module.
  • Figure 4: Two voxels, denoted $v_1 = (x_1, y_1, z_1)$ and $v_2 = (x_2, y_2, z_2)$, are located close together in the 3D space of the MRI volume, but in the fMRI signals (x-axis is the voxel index, y-axis is the magnitude of the voxel response) they are far away from each other.
  • Figure 5: Circles and rectangles represent vision and fMRI features, respectively. Each color indicates a different object of interest that the human brain is processing. The Brain fMRI Guidance Loss aims to align the visual and fMRI features of the same object while separating them from the features of other objects.
  • ...and 1 more figure
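
The observation in the Figure 4 caption, that voxels adjacent in 3D can sit far apart in the 1D fMRI signal, can be reproduced with a small sketch. It assumes the volume is flattened in row-major order, which may not match how the paper's data orders voxels; the grid `shape`, the two voxel coordinates, and the helper `flat_index` are hypothetical.

```python
import math


def flat_index(x, y, z, shape):
    """Row-major (C-order) position of voxel (x, y, z) in a flattened volume.

    Assumption: the actual voxel ordering in the fMRI data is not stated in
    this summary; row-major flattening is used here only for illustration.
    """
    _, ny, nz = shape
    return (x * ny + y) * nz + z


shape = (64, 64, 64)  # hypothetical volume size
v1 = (10, 20, 30)
v2 = (11, 20, 30)     # direct neighbour of v1 along the x-axis

# Distance between the voxels in 3D space.
spatial_gap = math.dist(v1, v2)

# Distance between the same voxels along the flattened voxel-index axis.
index_gap = abs(flat_index(*v1, shape) - flat_index(*v2, shape))

print(spatial_gap, index_gap)
```

Under this assumed ordering, the two neighbouring voxels are one unit apart in space but 64 × 64 = 4096 positions apart in the flattened signal, so a model that only sees the 1D index loses their spatial proximity. That gap is the kind of information a position embedding built from 3D coordinates could restore.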