ResVR: Joint Rescaling and Viewport Rendering of Omnidirectional Images
Weiqi Li, Shijie Zhao, Bin Chen, Xinhua Cheng, Junlin Li, Li Zhang, Jian Zhang
TL;DR
Previous omnidirectional image (ODI) rescaling methods optimize the quality of the equirectangular projection (ERP) image but neglect the viewport that users actually see on head-mounted displays (HMDs). ResVR closes this gap by jointly learning downscaling and viewport rendering: a discrete pixel sampling strategy and a spherical pixel shape representation enable end-to-end training from the low-resolution (LR) ERP image to the high-resolution (HR) viewport, and the viewport rendering module uses an implicit neural representation to render viewports directly from LR ERP features. The method achieves state-of-the-art viewport quality across different fields of view, resolutions, and view directions while keeping transmission overhead low. Because only the LR ERP image is transmitted, the framework could reduce bandwidth for virtual reality streaming without degrading the image quality that users perceive.
Abstract
With the advent of virtual reality technology, omnidirectional image (ODI) rescaling techniques are increasingly used to reduce transmitted and stored file sizes while preserving high image quality. Despite this progress, current ODI rescaling methods predominantly focus on enhancing the quality of images in equirectangular projection (ERP) format, overlooking the fact that the content viewed on head-mounted displays (HMDs) is a rendered viewport rather than an ERP image. In this work, we emphasize that focusing solely on ERP quality results in an inferior viewport visual experience for users. We therefore propose ResVR, the first comprehensive framework for the joint Rescaling and Viewport Rendering of ODIs. ResVR produces low-resolution (LR) ERP images for transmission while rendering high-quality viewports for users to watch on HMDs. In ResVR, a novel discrete pixel sampling strategy is developed to tackle the complex mapping between the viewport and the ERP image, enabling end-to-end training of the ResVR pipeline. Furthermore, a spherical pixel shape representation technique is derived from spherical differentiation to significantly improve the visual quality of rendered viewports. Extensive experiments demonstrate that ResVR outperforms existing methods in viewport rendering tasks across different fields of view, resolutions, and view directions while keeping transmission overhead low.
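To make the "complex mapping between the viewport and ERP" concrete, the following is a minimal sketch of the standard rectilinear (gnomonic) mapping from a viewport pixel to continuous ERP coordinates. It is background geometry only, not the discrete pixel sampling strategy or the spherical pixel shape representation proposed in ResVR. The function name `viewport_to_erp`, its parameters, and the axis conventions (x right, y up, z forward; positive yaw turns right, positive pitch looks up) are assumptions made for illustration.

```python
import math

def viewport_to_erp(u, v, vp_w, vp_h, fov_deg, yaw_deg, pitch_deg, erp_w, erp_h):
    """Map continuous viewport coordinates (u, v) to continuous ERP coordinates.

    Hypothetical helper: the viewport centre is (vp_w / 2, vp_h / 2) and
    fov_deg is the horizontal field of view of a pinhole (rectilinear) camera.
    """
    # Focal length in pixels implied by the horizontal field of view.
    f = (vp_w / 2) / math.tan(math.radians(fov_deg) / 2)

    # Ray through the viewport point in camera coordinates.
    x = u - vp_w / 2
    y = vp_h / 2 - v  # image rows grow downward, so flip to make y point up
    z = f

    # Rotate by pitch (about the x axis), then by yaw (about the y axis).
    p, yw = math.radians(pitch_deg), math.radians(yaw_deg)
    y1 = y * math.cos(p) + z * math.sin(p)
    z1 = -y * math.sin(p) + z * math.cos(p)
    x2 = x * math.cos(yw) + z1 * math.sin(yw)
    z2 = -x * math.sin(yw) + z1 * math.cos(yw)

    # Direction on the sphere: longitude in (-pi, pi], latitude in [-pi/2, pi/2].
    lon = math.atan2(x2, z2)
    lat = math.atan2(y1, math.hypot(x2, z2))

    # ERP spans 360 degrees horizontally and 180 degrees vertically.
    erp_x = (lon / (2 * math.pi) + 0.5) * erp_w
    erp_y = (0.5 - lat / math.pi) * erp_h
    return erp_x, erp_y
```

Under these conventions the viewport centre at zero yaw and pitch lands on the centre of the ERP image. The returned coordinates are generally non-integer and vary nonlinearly with view direction, so viewport pixels do not align with the ERP pixel grid.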
