From Cluster to Desktop: A Cache-Accelerated INR framework for Interactive Visualization of Tera-Scale Data

Daniel Zavorotny, Qi Wu, David Bauer, Kwan-Liu Ma

TL;DR

The paper addresses the challenge of visualizing tera-scale scientific data with implicit neural representations (INRs) by introducing a cache-accelerated INR rendering framework that combines compressed INRs with a scalable MRPD GPU cache. The approach adds a saliency-based priority scheduler and asynchronous brick loading to minimize per-frame INR inferences, enabling interactive visualization on consumer hardware. Key contributions include a full pipeline integrating INR compression with MRPD caching, a brick-based cache architecture with priority ranking and miss handling, and an extensive evaluation demonstrating an average speedup of about 5x over the state-of-the-art INR rendering method, along with substantial scalability on large datasets. This work broadens the practical use of INRs for high-performance scientific visualization and lays the groundwork for integrating neural representations into extreme-scale visualization workflows.

Abstract

Machine learning has enabled the use of implicit neural representations (INRs) to efficiently compress and reconstruct massive scientific datasets. However, despite advances in fast INR rendering algorithms, INR-based rendering remains computationally expensive, as computing data values from an INR is significantly slower than reading them from GPU memory. This bottleneck currently restricts interactive INR visualization to professional workstations. To address this challenge, we introduce an INR rendering framework accelerated by a scalable, multi-resolution GPU cache capable of efficiently representing tera-scale datasets. By minimizing redundant data queries and prioritizing novel volume regions, our method reduces the number of INR computations per frame, achieving an average 5x speedup over the state-of-the-art INR rendering method while still maintaining high visualization quality. Coupled with existing hardware-accelerated INR compressors, our framework enables scientists to generate and compress massive datasets in situ on high-performance computing platforms and then interactively explore them on consumer-grade hardware post hoc.
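The caching idea described above can be illustrated with a minimal sketch. It assumes a single-resolution, CPU-side LRU brick cache and a toy analytic function standing in for INR inference; the names `BRICK`, `inr_value`, and `BrickCache` are hypothetical and do not come from the paper, and the multi-resolution page table, priority ranking, and GPU-side asynchronous loading are all omitted.

```python
from collections import OrderedDict

BRICK = 4  # hypothetical brick edge length in voxels


def inr_value(x, y, z):
    """Stand-in for an expensive INR inference (toy analytic field)."""
    return 0.01 * x + 0.02 * y + 0.03 * z


class BrickCache:
    """Tiny LRU cache keyed by brick id; each entry holds one decoded brick."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.bricks = OrderedDict()  # brick id -> {voxel coordinate: value}
        self.pending = []            # missed brick ids waiting to be loaded
        self.inr_calls = 0           # per-sample fallback inferences only

    def sample(self, x, y, z):
        bid = (x // BRICK, y // BRICK, z // BRICK)
        brick = self.bricks.get(bid)
        if brick is not None:
            self.bricks.move_to_end(bid)  # mark as most recently used
            return brick[(x, y, z)]
        if bid not in self.pending:
            self.pending.append(bid)      # request the brick for later loading
        self.inr_calls += 1
        return inr_value(x, y, z)         # cache miss: fall back to the INR

    def load_pending(self):
        """Decode requested bricks (synchronous here, for simplicity)."""
        for bx, by, bz in self.pending:
            voxels = {
                (x, y, z): inr_value(x, y, z)
                for x in range(bx * BRICK, (bx + 1) * BRICK)
                for y in range(by * BRICK, (by + 1) * BRICK)
                for z in range(bz * BRICK, (bz + 1) * BRICK)
            }
            self.bricks[(bx, by, bz)] = voxels
            if len(self.bricks) > self.capacity:
                self.bricks.popitem(last=False)  # evict least recently used
        self.pending.clear()


# Two "frames" along the same diagonal ray, which touches two bricks.
cache = BrickCache(capacity=8)
ray = [(i, i, i) for i in range(8)]
first = [cache.sample(*p) for p in ray]   # every sample misses
cache.load_pending()
second = [cache.sample(*p) for p in ray]  # every sample hits the cache
```

In this toy setup the second pass over the ray triggers no further fallback inferences, which is the effect the cache is meant to achieve: samples already resident in memory are read rather than recomputed.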

Paper Structure

This paper contains 25 sections, 8 figures, 1 table, and 2 algorithms.

Figures (8)

  • Figure 1: After generating our macro-cell structure and compressed INR (0), our wavefront renderer uses the sampler interface (1) to request voxels from the MRPD cache manager (2). Missing brick IDs are sent to the request handler in small batches (3), where they are scheduled for individual loading on the GPU (4) before being transferred into the data cache. The page table hierarchy and LRUs are then updated accordingly (5). Finally, missing voxels are inferred from the INR by the application (6) at the end of each sampling step.
  • Figure 2: An overview of the model architecture and hash-grid encoder. A vector of size $n$ is constructed from each resolution grid via interpolation of nearby grid points. The resulting feature vector thus encodes a multi-resolution representation of the input, allowing for a smaller MLP.
  • Figure 3: (Top) Performance comparison with our priority ranking enabled and disabled. (Bottom) Timeline of the cache content after rendering without LoD pre-loading and fallback network calls on cache misses at $250$, $500$, $1000$, and $2000$ frames, respectively. Ranking enables a more context-aware representation of the data in the cache.
  • Figure 4: Image quality comparison of our multi-resolution pipeline (see the "Cached" columns) with Wu et al.'s single-resolution INR pipeline [Wu:10175377] as our ground truth. The leftmost column of each comparison depicts pixel differences visually using FLIP [flip]; lighter is better. Additionally, we calculate the PSNR, MSSIM, and LPIPS quality metrics based on both rendered images for each dataset. Results show that our cached pipeline maintains decent reconstruction quality while achieving the performance results shown in Table 1.
  • Figure 5: FPS measured each frame over the course of our testing for Table 1. (Top) Results with our cache disabled and sampling directly from the INR. (Middle) Results after enabling our cache without pre-loading higher LoDs. (Bottom) Results with pre-loading enabled. Pre-loading greatly improves performance during the initial frames and allows the FPS to stabilize more quickly on the more challenging datasets.
  • ...and 3 more figures