Max360IQ: Blind Omnidirectional Image Quality Assessment with Multi-axis Attention

Jiebin Yan, Ziwen Tan, Yuming Fang, Jiale Rao, Yifan Zuo

TL;DR

This work tackles blind omnidirectional image quality assessment (BOIQA) for 360-degree images, especially under non-uniform distortion. It introduces Max360IQ, which combines a backbone of stacked multi-axis attention modules, a multi-scale feature integration (MSFI) module, and a quality regression module with deep semantic guidance (DSG), plus optional GRUs that model the temporal order in which viewports are viewed. The approach handles non-uniform and uniform distortion in one framework through viewport-based feature extraction, GeM-based multi-scale pooling, and temporal regression, producing per-viewport and per-image quality scores. In terms of the Spearman rank correlation coefficient (SRCC), Max360IQ surpasses Assessor360 by 3.6% on JUFE (non-uniform distortion) and by 0.4% and 0.8% on the uniformly distorted OIQA and CVIQ databases, which makes it relevant to automatic quality assessment and content selection in immersive media. Code is publicly available at the project repository.
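The GeM-based pooling mentioned above can be illustrated with a minimal sketch. Generalized-mean (GeM) pooling reduces a feature map to one value per channel as the p-th root of the mean of the p-th powers; p = 1 gives average pooling and large p approaches max pooling. The function names `gem_pool` and `multi_scale_descriptor`, the default `p = 3.0`, and the nested-list representation of feature maps are assumptions made for illustration and are not taken from the Max360IQ code.

```python
from typing import List, Sequence


def gem_pool(feature_map: Sequence[Sequence[float]], p: float = 3.0,
             eps: float = 1e-6) -> float:
    """Generalized-mean pooling of one channel's 2D activation map.

    p = 3.0 is an assumed default, not a value reported in this summary.
    """
    # Clamp to eps so that fractional powers of non-positive values are avoided.
    values = [max(v, eps) for row in feature_map for v in row]
    if not values:
        raise ValueError("feature_map must be non-empty")
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def multi_scale_descriptor(stages: Sequence[Sequence[Sequence[Sequence[float]]]],
                           p: float = 3.0) -> List[float]:
    """Pool every channel of every stage and concatenate the results.

    `stages` is a hypothetical list of backbone stages; each stage is a list
    of channels, and each channel is a 2D map whose size may differ per stage.
    """
    return [gem_pool(channel, p) for stage in stages for channel in stage]


if __name__ == "__main__":
    coarse = [[[1.0, 2.0], [3.0, 4.0]]]             # one channel, 2x2
    fine = [[[0.5] * 4 for _ in range(4)]] * 2      # two channels, 4x4
    print(multi_scale_descriptor([fine, coarse]))
```

Because each map collapses to a scalar, stages with different spatial resolutions yield descriptors that can be concatenated directly, which is one plausible reason to use such pooling for multi-scale fusion.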

Abstract

An omnidirectional image, also called a 360-degree image, captures the entire 360-degree scene and therefore gives users a more realistic sense of immersion than ordinary 2D or stereoscopic images. This same property makes it difficult to measure the perceptual quality of omnidirectional images, which is closely tied to users' quality of experience, especially when the images suffer from non-uniform distortion. In this paper, we propose a blind omnidirectional image quality assessment (BOIQA) model with multi-axis attention (Max360IQ) that measures the quality of both uniformly and non-uniformly distorted omnidirectional images. Specifically, Max360IQ consists mainly of a backbone with stacked multi-axis attention modules that capture both global and local spatial interactions of the extracted viewports, a multi-scale feature integration (MSFI) module that fuses multi-scale features, and a quality regression module with deep semantic guidance that predicts the quality of omnidirectional images. Experimental results show that Max360IQ outperforms the state-of-the-art Assessor360 by 3.6% in SRCC on the JUFE database with non-uniform distortion, and improves SRCC by 0.4% and 0.8% on the OIQA and CVIQ databases, respectively. The source code is available at https://github.com/WenJuing/Max360IQ.
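Since the reported gains are stated in SRCC, a short sketch of how that metric is computed may help. SRCC is the Pearson correlation between the ranks of the predicted scores and the ranks of the subjective scores, so it rewards correct ordering regardless of scale. The helper names below and the toy scores are illustrative assumptions; the authors' evaluation script may differ (for example, it may use a library routine).

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def srcc(predicted: Sequence[float], subjective: Sequence[float]) -> float:
    """Spearman rank correlation between predictions and subjective scores."""
    if len(predicted) != len(subjective) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(predicted), average_ranks(subjective))


if __name__ == "__main__":
    # Toy values only; these are not scores from JUFE, OIQA, or CVIQ.
    print(srcc([0.2, 0.9, 0.5, 0.7], [31.0, 80.0, 62.0, 55.0]))
```

A model whose predictions are a monotone but nonlinear function of the subjective scores still obtains an SRCC of 1, which is why SRCC is read as a measure of prediction monotonicity rather than of absolute accuracy.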

Paper Structure

This paper contains 16 sections, 19 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Visual examples of a uniformly distorted OI and a non-uniformly distorted OI. The top row shows an OI with uniform distortion from the CVIQ database [sun2018large], while the bottom row shows an OI with non-uniform distortion from the JUFE database [fang2022perceptual].
  • Figure 2: The architecture of the proposed Max360IQ. It consists mainly of three parts: a backbone, a multi-scale feature integration (MSFI) module, and a quality regression (QR) module. The GRU component is optional and is included or omitted depending on the scenario (non-uniformly or uniformly distorted OIs) to obtain the best performance.
  • Figure 3: Illustration of viewport extraction under four conditions. $SP$ denotes the starting point of the scanpath, and the region marked in red denotes low quality.
  • Figure 4: Scatter plots of the scores predicted by several methods against the subjective scores on the JUFE database. Each color represents the predicted scores under a different condition.
  • Figure 5: The influence of the number of viewports on the performance of the proposed Max360IQ.
  • ...and 1 more figures