Hi-DREAM: Brain Inspired Hierarchical Diffusion for fMRI Reconstruction via ROI Encoder and visuAl Mapping

Guowei Zhang; Yun Zhao; Moein Khajehnejad; Adeel Razi; Levin Kuhlmann

TL;DR

The paper addresses the challenge of reconstructing natural images from fMRI by incorporating cortical hierarchy into diffusion-based decoders. It introduces Hi-DREAM, which uses an ROI adapter to form early/mid/late streams and a multi-scale cortical pyramid aligned with U-Net depths, with a depth-matched ROI-ControlNet for selective conditioning. On the NSD dataset, Hi-DREAM achieves state-of-the-art performance on high-level semantic metrics while maintaining competitive low-level fidelity, and ablation studies reveal distinct roles for early, middle, and late ROIs. This work offers a neuroanatomically grounded, interpretable alternative to global embeddings and highlights how structured conditioning can advance brain-to-image generation and provide neuroscientific insights.

Abstract

Mapping human brain activity to natural images offers a new window into vision and cognition, yet current diffusion-based decoders face a core difficulty: most condition directly on fMRI features without analyzing how visual information is organized across the cortex. This overlooks the brain's hierarchical processing and blurs the roles of early, middle, and late visual areas. We propose Hi-DREAM, a brain-inspired conditional diffusion framework that makes the cortical organization explicit. A region-of-interest (ROI) adapter groups fMRI into early/mid/late streams and converts them into a multi-scale cortical pyramid aligned with the U-Net depth (shallow scales preserve layout and edges; deeper scales emphasize objects and semantics). A lightweight, depth-matched ControlNet injects these scale-specific hints during denoising. The result is an efficient and interpretable decoder in which each signal plays a brain-like role, allowing the model not only to reconstruct images but also to illuminate functional contributions of different visual areas. Experiments on the Natural Scenes Dataset (NSD) show that Hi-DREAM attains state-of-the-art performance on high-level semantic metrics while maintaining competitive low-level fidelity. These findings suggest that structuring conditioning by cortical hierarchy is a powerful alternative to purely data-driven embeddings and provides a useful lens for studying the visual cortex.
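
The ROI grouping and depth-aligned pyramid described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the ROI names, the assignment of ROIs to streams, the one-stream-per-depth mapping, the channel and spatial sizes, and the function names `group_by_stream` and `build_pyramid` are all assumptions chosen for illustration. Random weights stand in for what would be a learned adapter, so only the data flow and tensor shapes are meaningful.

```python
import numpy as np

# Hypothetical ROI-to-stream grouping (an assumption, not taken from the paper).
STREAMS = {
    "early": ["V1", "V2"],
    "mid": ["V3", "V4"],
    "late": ["LOC", "FFA", "PPA"],
}

# Hypothetical pyramid levels as (channels, spatial size).
# Shallow levels are high resolution (layout, edges);
# deeper levels are low resolution with more channels (objects, semantics).
PYRAMID = {"early": (4, 32), "mid": (8, 16), "late": (16, 8)}


def group_by_stream(fmri, roi_labels):
    """Split a flat voxel vector into early/mid/late vectors by per-voxel ROI label."""
    roi_labels = np.asarray(roi_labels)
    return {
        name: fmri[np.isin(roi_labels, rois)]
        for name, rois in STREAMS.items()
    }


def build_pyramid(streams, rng):
    """Project each stream to a (channels, size, size) hint map.

    A trained adapter would learn these weights; the random matrices
    used here only illustrate the shapes involved.
    """
    pyramid = {}
    for name, voxels in streams.items():
        channels, size = PYRAMID[name]
        weight = rng.standard_normal((channels * size * size, voxels.size))
        weight /= np.sqrt(voxels.size)  # keep output scale roughly constant
        pyramid[name] = (weight @ voxels).reshape(channels, size, size)
    return pyramid


# Synthetic example: 200 voxels with made-up ROI labels.
rng = np.random.default_rng(0)
roi_labels = (
    ["V1"] * 50 + ["V2"] * 40 + ["V3"] * 30 + ["V4"] * 30
    + ["LOC"] * 20 + ["FFA"] * 15 + ["PPA"] * 15
)
fmri = rng.standard_normal(len(roi_labels))

streams = group_by_stream(fmri, roi_labels)
pyramid = build_pyramid(streams, rng)
shapes = {name: hint.shape for name, hint in pyramid.items()}
```

In a full decoder, each hint map would be passed to the ControlNet branch at the U-Net depth with the matching spatial resolution, so that early-stream signals condition shallow layers and late-stream signals condition deep ones.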
