Bridging the Vision-Brain Gap with an Uncertainty-Aware Blur Prior
Haitao Wu, Qing Li, Changqing Zhang, Zhen He, Xiaomin Ying
TL;DR
This work tackles the gap between visual stimuli and brain signals by identifying two GAPs: System GAP (loss of high-frequency details) and Random GAP (dynamic perceptual/cognitive processes and noise). It introduces Uncertainty-aware Blur Prior (UBP), blending Gaussian blur and uncertainty-driven radius adjustments to reduce mismatch and improve cross-modal alignment in a vision-brain contrastive learning setup. Using RSVP-based THINGS-EEG/THINGS-MEG data and CLIP-based visual encoders, UBP achieves state-of-the-art zero-shot brain-to-image retrieval, with notable improvements in Top-1 and Top-5 accuracies and robustness to subject variability. The approach offers a practical avenue for more reliable brain-computer interfaces and motivates uncertainty-aware priors for broader multimodal learning tasks.
Abstract
Can our brain signals faithfully reflect the original visual stimuli, even including high-frequency details? Although human perceptual and cognitive capacities enable us to process and remember visual information, these abilities are constrained by several factors, such as limited attentional resources and the finite capacity of visual memory. When visual stimuli are processed by the human visual system into brain signals, some information is inevitably lost, leading to a discrepancy known as the System GAP. Additionally, perceptual and cognitive dynamics, along with technical noise in signal acquisition, degrade the fidelity of brain signals relative to the visual stimuli; this discrepancy is known as the Random GAP. When encoded brain representations are directly aligned with the corresponding pretrained image features, the model must bridge both the System GAP and the Random GAP between the paired data. However, when paired data are limited, these gaps are difficult for the model to learn, leading to overfitting and poor generalization to new data. To address these GAPs, we propose a simple yet effective approach called the Uncertainty-aware Blur Prior (UBP). It estimates the uncertainty within the paired data, reflecting the mismatch between brain signals and visual stimuli. Based on this uncertainty, UBP dynamically blurs the high-frequency details of the original images, reducing the impact of the mismatch and improving alignment. Our method achieves a top-1 accuracy of 50.9% and a top-5 accuracy of 79.7% on the zero-shot brain-to-image retrieval task, surpassing previous state-of-the-art methods by margins of 13.7% and 9.8%, respectively. Code is available on GitHub: https://github.com/HaitaoWuTJU/Uncertainty-aware-Blur-Prior
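The idea of blurring an image more strongly when the paired brain signal is less reliable can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function names, the assumption that uncertainty is a scalar in [0, 1], the linear mapping from uncertainty to blur strength, and the `base_sigma` and `max_sigma` values are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kernel(sigma: float) -> np.ndarray:
    """Return a 1-D Gaussian kernel truncated at 3 sigma, normalized to sum to 1."""
    half = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-half, half + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()


def blur(image: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur of an (H, W) or (H, W, C) image."""
    out = image.astype(np.float64)
    if sigma <= 0:
        return out
    k = gaussian_kernel(sigma)
    half = len(k) // 2
    for axis in (0, 1):  # blur rows, then columns
        pad = [(0, 0)] * out.ndim
        pad[axis] = (half, half)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, k, mode="valid"), axis, padded
        )
    return out


def uncertainty_to_sigma(u: float, base_sigma: float = 0.5,
                         max_sigma: float = 3.0) -> float:
    """Hypothetical mapping: higher uncertainty in [0, 1] gives a stronger blur."""
    u = float(np.clip(u, 0.0, 1.0))
    return base_sigma + u * (max_sigma - base_sigma)


def uncertainty_aware_blur(image: np.ndarray, u: float) -> np.ndarray:
    """Blur an image by an amount driven by the pair's estimated uncertainty."""
    return blur(image, uncertainty_to_sigma(u))
```

In a training loop, the blurred image would replace the original before it is passed to the pretrained image encoder, so that pairs with a larger estimated mismatch are aligned against features carrying less high-frequency detail. How the uncertainty itself is estimated is left out of this sketch.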
