MetaSapiens: Real-Time Neural Rendering with Efficiency-Aware Pruning and Accelerated Foveated Rendering

Weikai Lin, Yu Feng, Yuhao Zhu

TL;DR

MetaSapiens targets real-time point-based neural rendering (PBNR) on mobile devices by combining efficiency-aware pruning, foveated rendering (FR), and a co-designed accelerator. It introduces a CE-based pruning metric and a WS-driven scale decay to reduce tile-ellipse intersections, and a foveated PBNR pipeline with selective multi-versioning guided by HVSQ to preserve perceptual quality. A hardware accelerator with tile merging and incremental pipelining addresses FR load imbalance, delivering speedups of up to roughly 20x and substantial energy savings over GPU baselines, with subjective quality matching state-of-the-art dense PBNR. The work demonstrates that AR/VR-grade photorealistic rendering is viable on mobile devices and provides a complete hardware-software co-design framework for FR-based PBNR, with potential impact on mobile AR/VR pipelines, digital twins, and real-time immersive graphics.

Abstract

Point-Based Neural Rendering (PBNR) is emerging as a promising class of rendering techniques, which are permeating all aspects of society, driven by a growing demand for real-time, photorealistic rendering in AR/VR and digital twins. Achieving real-time PBNR on mobile devices is challenging. This paper proposes MetaSapiens, a PBNR system that for the first time delivers real-time neural rendering on mobile devices while maintaining human visual quality. MetaSapiens combines three techniques. First, we present an efficiency-aware pruning technique to optimize rendering speed. Second, we introduce a Foveated Rendering (FR) method for PBNR, leveraging humans' low visual acuity in peripheral regions to relax rendering quality and improve rendering speed. Finally, we propose an accelerator design for FR, addressing the load imbalance issue in (FR-based) PBNR. Our evaluation shows that our system achieves an order of magnitude speedup over existing PBNR models without sacrificing subjective visual quality, as confirmed by a user study. The code and demo are available at: https://horizon-lab.org/metasapiens/.

Paper Structure (55 sections, 6 equations, 15 figures, 1 table)

This paper contains 55 sections, 6 equations, 15 figures, 1 table.

Figures (15)

  • Figure 1: Illustration of PBNR, which parameterizes the scene with a set of points, each associated with a 3D Gaussian distribution that gives rise to an ellipsoid. The ellipsoids are projected to ellipses on the image plane, where the ellipses are sorted (per tile, e.g., $2\times 2$ pixels). The color of a pixel is calculated by integrating the contribution of each intersecting ellipse (e.g., a, c, d, e for p).
  • Figure 2: Pixels under the user's gaze have low eccentricities, where human visual acuity is the highest; the peripheral pixels have high eccentricities, where human visual acuity is low. In peripheral regions, the visual stimulus (image) can be altered without being discriminable from the reference stimulus if the statistics of the image features are close, as quantified by the HVSQ metric (Eqn. \ref{eq:hvsq}). SP: spatial pooling.
  • Figure 3: FPS distribution of recent PBNR models on common datasets, measured on the mobile Volta GPU of a Jetson Xavier.
  • Figure 4: Point count vs. latency per frame and the number of tile-ellipse intersections vs. latency per frame.
  • Figure 5: Two ellipses intersect different numbers of tiles and so contribute differently to the computation cost.
  • ...and 10 more figures
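
The point of Figures 4 and 5, that cost tracks tile-ellipse intersections rather than point count alone, can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `tiles_touched`, the 16-pixel tile size, the 3-sigma cutoff, and the use of the ellipse's axis-aligned bounding box (a conservative over-estimate of the true intersections, not clamped to the image bounds) are all assumptions made for illustration.

```python
import math

def tiles_touched(cx, cy, cov, tile=16, k=3.0):
    """Count the tiles overlapped by the bounding box of a projected ellipse.

    cx, cy : ellipse centre in pixels (hypothetical inputs).
    cov    : (a, b, c) for the 2D covariance [[a, b], [b, c]].
    tile   : tile edge length in pixels (assumed value, not from the source).
    k      : number of standard deviations kept (assumed value).
    """
    a, _b, c = cov
    # The k-sigma ellipse has bounding-box half-extents k*sqrt(a) and k*sqrt(c);
    # the off-diagonal term rotates the ellipse but does not change them.
    rx = k * math.sqrt(a)
    ry = k * math.sqrt(c)
    # Indices of the first and last tile covered along each axis.
    x0 = math.floor((cx - rx) / tile)
    x1 = math.floor((cx + rx) / tile)
    y0 = math.floor((cy - ry) / tile)
    y1 = math.floor((cy + ry) / tile)
    return (x1 - x0 + 1) * (y1 - y0 + 1)

# Three points, one each, but very different per-point cost:
small = tiles_touched(72, 72, (1.0, 0.0, 1.0))       # 1 tile
large = tiles_touched(64, 64, (100.0, 0.0, 100.0))   # 16 tiles
thin = tiles_touched(64, 72, (400.0, 0.0, 1.0))      # 8 tiles
print(small, large, thin)
```

Under these assumptions, shrinking an ellipse's scale lowers the number of tiles it touches, so the rendering work can drop even when the point count stays the same.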