MS-Glance: Bio-Inspired Non-semantic Context Vectors and their Applications in Supervising Image Reconstruction

Ziqi Gao, Wendi Yang, Yujia Li, Lei Xing, S. Kevin Zhou

TL;DR

The paper proposes MS-Glance, a biologically informed non-semantic context descriptor, together with the Glance Index Measure for comparing two images. It evaluates Glance supervision in two reconstruction tasks: image fitting with implicit neural representation (INR) and undersampled MRI reconstruction.

Abstract

Non-semantic context information is crucial for visual recognition, as the human visual perception system first uses global statistics to process scenes rapidly before identifying specific objects. However, while semantic information is increasingly incorporated into computer vision tasks such as image reconstruction, non-semantic information, such as global spatial structures, is often overlooked. To bridge the gap, we propose a biologically informed non-semantic context descriptor, MS-Glance, along with the Glance Index Measure for comparing two images. A global Glance vector is formulated by randomly retrieving pixels based on a perception-driven rule from an image to form a vector representing non-semantic global context, while a local Glance vector is a flattened local image window, mimicking a zoom-in observation. The Glance Index is defined as the inner product of two standardized sets of Glance vectors. We evaluate the effectiveness of incorporating Glance supervision in two reconstruction tasks: image fitting with implicit neural representation (INR) and undersampled MRI reconstruction. Extensive experimental results show that MS-Glance outperforms existing image restoration losses across both natural and medical images. The code is available at https://github.com/Z7Gao/MSGlance.
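
To make the definitions in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation (see the repository linked above for that). It rests on several assumptions: the perception-driven sampling rule is not described in the abstract, so uniform random pixel indices stand in for it; "standardized" is read as zero mean and unit norm per vector, which makes the inner product a Pearson correlation; and the per-vector scores are averaged into one number. All function names (`standardize`, `global_glance_vectors`, `local_glance_vectors`, `glance_index`) are hypothetical.

```python
import numpy as np

def standardize(vectors, eps=1e-8):
    # Assumed reading of "standardized": zero mean, unit norm per vector.
    centered = vectors - vectors.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    return centered / (norms + eps)

def global_glance_vectors(image, indices):
    # Gather pixels at pre-drawn flat indices; one row per Glance vector.
    # Uniform random indices are a stand-in for the paper's sampling rule.
    return image.reshape(-1)[indices]

def local_glance_vectors(image, window):
    # Flatten non-overlapping window x window patches of a 2D image.
    h, w = image.shape
    h_c, w_c = h - h % window, w - w % window
    blocks = image[:h_c, :w_c].reshape(h_c // window, window, w_c // window, window)
    return blocks.transpose(0, 2, 1, 3).reshape(-1, window * window)

def glance_index(vecs_a, vecs_b):
    # Inner product of standardized vector pairs, averaged over the set.
    return float(np.mean(np.sum(standardize(vecs_a) * standardize(vecs_b), axis=1)))

rng = np.random.default_rng(0)
image = rng.random((32, 32))
noisy = image + 0.5 * rng.standard_normal(image.shape)

# The same indices must be used for both images so the vectors correspond.
indices = rng.integers(0, image.size, size=(16, 64))
g_self = glance_index(global_glance_vectors(image, indices),
                      global_glance_vectors(image, indices))
g_noisy = glance_index(global_glance_vectors(image, indices),
                       global_glance_vectors(noisy, indices))
l_noisy = glance_index(local_glance_vectors(image, 8),
                       local_glance_vectors(noisy, 8))
```

Under these assumptions the index lies in [-1, 1] and equals approximately 1 for identical inputs, so a training loss could plausibly be formed as one minus the index. How the paper combines the global and local terms is not stated in the abstract.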

Paper Structure

This paper contains 24 sections, 13 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Category of human recognition [Oliva2001ModelingSpatialEnvelope] and the extraction of multi-scale Glance Vectors.
  • Figure 2: Leveraging the MRI air prior with an intensity threshold.
  • Figure 3: Step-wise SIREN reconstruction performance with various loss functions on a classic RGB image. The images on the right provide a zoomed-in view of the last 150 steps.
  • Figure 4: Qualitative results of SIREN reconstruction with various loss functions on the COCO, CelebA, and IXI datasets. The odd rows show the reconstructed images and the even rows show the corresponding error maps, which are computed as the mean absolute difference between the reconstructed image and the ground truth (GT). Error maps are normalized per row for better visualization.
  • Figure 5: Conventional loss functions for training undersampled MRI reconstruction networks compared with our MS-Glance. The reconstructed images are evaluated with PSNR and SSIM and are zoomed in at the red region, which contains the cortex structure.
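
The error maps and PSNR values mentioned in the captions for Figures 4 and 5 can be sketched as follows. This is an illustration under assumptions, not the authors' evaluation code: the "mean" in the error map is taken over color channels, per-row normalization is read as dividing every map in a figure row by that row's shared maximum, and images are assumed to be scaled to [0, 1]. The function names are hypothetical.

```python
import numpy as np

def error_map(recon, gt):
    # Absolute difference, averaged over channels for color images (assumed).
    diff = np.abs(recon - gt)
    return diff.mean(axis=-1) if diff.ndim == 3 else diff

def normalize_row(maps, eps=1e-12):
    # Scale all maps shown in one figure row by their shared maximum (assumed).
    peak = max(m.max() for m in maps)
    return [m / (peak + eps) for m in maps]

def psnr(recon, gt, data_range=1.0):
    # Standard definition: 10 * log10(MAX^2 / MSE); undefined when MSE is 0.
    mse = np.mean((recon - gt) ** 2)
    return float(10.0 * np.log10(data_range ** 2 / mse))

gt = np.zeros((4, 4, 3))
recon_a = gt + 0.1   # uniform error of 0.1
recon_b = gt + 0.2   # uniform error of 0.2
maps = normalize_row([error_map(recon_a, gt), error_map(recon_b, gt)])
psnr_a = psnr(recon_a, gt)
```

Sharing one scale across a row keeps the maps visually comparable between loss functions, which is presumably why the captions mention per-row normalization.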