NeRF-NQA: No-Reference Quality Assessment for Scenes Generated by NeRF and Neural View Synthesis Methods

Qiang Qu, Hanxue Liang, Xiaoming Chen, Yuk Ying Chung, Yiran Shen

TL;DR

This work introduces NeRF-NQA, the first no-reference quality assessment framework designed for scenes generated by Neural View Synthesis and NeRF variants with dense viewpoints. It combines a viewwise module that captures spatial fidelity and inter-view consistency with a pointwise module that encodes angular quality through the Pointwise Normalized Spherical Gradient (PNSG) features, fused by an MLP to produce comprehensive quality scores. Across Fieldwork, LLFF, and Lab datasets, NeRF-NQA outperforms 23 baseline QA methods in RMSE, SRCC, PLCC, and OR, and demonstrates strong cross-dataset generalization across diverse NVS methods. Limitations include dependence on COLMAP-derived sparse points and focus on front-facing scenes; future work will broaden coverage to 360-degree content and additional NVS techniques. The method offers a practical, no-reference tool that aligns quality assessments more closely with human perception for densely-sampled NVS scenes.
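The agreement metrics named above can be illustrated with a minimal NumPy sketch. The score arrays below are hypothetical values chosen only for the example, not results from the paper, and the function names are illustrative. OR (outlier ratio) is left out because its threshold definition is not given in this summary.

```python
import numpy as np


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Average the ranks within each group of equal values.
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks


def rmse(pred, target):
    """Root mean squared error between predicted and subjective scores."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean((pred - target) ** 2)))


def plcc(pred, target):
    """Pearson linear correlation coefficient."""
    return float(np.corrcoef(pred, target)[0, 1])


def srcc(pred, target):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return plcc(average_ranks(pred), average_ranks(target))


# Hypothetical subjective scores and metric predictions (illustration only).
subjective = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.9, 3.2, 3.9]

print("RMSE:", rmse(predicted, subjective))
print("PLCC:", plcc(predicted, subjective))
print("SRCC:", srcc(predicted, subjective))
```

Lower RMSE and higher SRCC and PLCC indicate closer agreement with the subjective scores. SRCC depends only on the ordering of the predictions, so it is unaffected by any monotonic rescaling of the metric's output.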

Abstract

Neural View Synthesis (NVS) has demonstrated efficacy in generating high-fidelity dense viewpoint videos from an image set with sparse views. However, existing quality assessment methods such as PSNR, SSIM, and LPIPS are not tailored to scenes with dense viewpoints synthesized by NVS and NeRF variants, so they often fall short in capturing perceptual quality, including the spatial and angular aspects of NVS-synthesized scenes. Furthermore, the lack of dense ground-truth views makes full-reference quality assessment of NVS-synthesized scenes challenging. For instance, datasets such as LLFF provide only sparse images, which are insufficient for complete full-reference assessment. To address these issues, we propose NeRF-NQA, the first no-reference quality assessment method for densely observed scenes synthesized by NVS and NeRF variants. NeRF-NQA employs a joint quality assessment strategy, integrating viewwise and pointwise approaches, to evaluate the quality of NVS-generated scenes. The viewwise approach assesses the spatial quality of each individual synthesized view and the overall inter-view consistency, while the pointwise approach focuses on the angular qualities of scene surface points and their compound inter-point quality. Extensive evaluations compare NeRF-NQA with 23 mainstream visual quality assessment methods (from the fields of image, video, and light-field assessment). The results demonstrate that NeRF-NQA significantly outperforms existing assessment methods and shows substantial superiority in assessing NVS-synthesized scenes without references. An implementation of this paper is available at https://github.com/VincentQQu/NeRF-NQA.
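The joint strategy described above, in which viewwise and pointwise quality features are combined into one scene-level score, can be sketched as a small fusion network. This is only an illustrative sketch and not the authors' implementation: the feature dimensions, the single hidden layer, the ReLU activation, and the names `init_params` and `fuse_scores` are all assumptions, and the weights are random and untrained, so the printed score carries no meaning.

```python
import numpy as np


def init_params(in_dim, hidden_dim, rng):
    """Create random weights for a one-hidden-layer MLP (untrained, illustrative)."""
    w1 = rng.normal(0.0, 0.1, size=(hidden_dim, in_dim))
    b1 = np.zeros(hidden_dim)
    w2 = rng.normal(0.0, 0.1, size=hidden_dim)
    b2 = 0.0
    return w1, b1, w2, b2


def fuse_scores(view_feat, point_feat, params):
    """Concatenate viewwise and pointwise features and map them to one scalar."""
    w1, b1, w2, b2 = params
    x = np.concatenate([view_feat, point_feat])
    hidden = np.maximum(0.0, w1 @ x + b1)  # ReLU (assumed activation)
    return float(w2 @ hidden + b2)


rng = np.random.default_rng(0)

# Hypothetical feature sizes: 8 viewwise and 4 pointwise features per scene.
view_feat = rng.normal(size=8)
point_feat = rng.normal(size=4)

params = init_params(in_dim=12, hidden_dim=16, rng=rng)
score = fuse_scores(view_feat, point_feat, params)
print("fused quality score (untrained):", score)
```

In a trained system the weights would be fitted so that the fused output tracks subjective quality scores; the sketch only shows how two feature sources can feed a single regressor.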

Paper Structure

This paper contains 18 sections, 3 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: NVS-generated scenes can be conceptualized from two perspectives: views (left) and points (right). From the perspective of views, a scene can be perceived as an ensemble of views originating from diverse viewpoints. From the perspective of points, a scene can be perceived as a collection of surface points where each surface point can be observed from multiple angles.
  • Figure 2: Overview of the Proposed NVS Quality Assessment Framework.
  • Figure 3: The Structure of the Viewwise Quality Assessment Module.
  • Figure 4: The Detailed Architecture of the Pointwise Quality Assessment Module.
  • Figure 5: Scatter plots illustrating the correlation between ground-truth JOD and the estimates made by the most widely used metrics for NVS (i.e., PSNR, SSIM, and LPIPS) and the proposed NeRF-NQA across the Fieldwork, LLFF, and Lab datasets. Distinct symbols and colors denote different scenes. Each subfigure features a red line representing the ideal prediction trajectory (i.e., ground truth == metric estimation). The closer the data points lie to this red line, the better the metric performs.
  • ...and 2 more figures