cuHPX: GPU-Accelerated Differentiable Spherical Harmonic Transforms on HEALPix Grids

Xiaopo Cheng, Akshay Subramaniam, Shixun Wu, Noah Brenowitz

TL;DR

cuHPX delivers a GPU-accelerated, differentiable framework for spherical harmonic transforms on HEALPix grids, addressing the challenges of irregular pixel geometry and large-scale spherical data. It introduces a Bluestein-based kernel fusion, an xyf intermediate layout for efficient data remapping, adjoint-based differentiability for backpropagation, and out-of-core memory strategies for Legendre transforms, achieving substantial speedups while maintaining accuracy. The approach enables seamless regridding between HEALPix and equiangular grids and supports batch processing, making it suitable for climate modeling, astrophysics, and machine learning workflows. Overall, cuHPX couples high-performance GPU kernels with differentiable programming to enable scalable, accurate, and interoperable spherical data analysis.
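
HEALPix rings have lengths that are generally not powers of two, which is where a Bluestein-type FFT is useful: it rewrites a DFT of arbitrary length as a convolution that can be evaluated with power-of-two FFTs. The following is a minimal NumPy sketch of the textbook Bluestein algorithm only. It is not cuHPX's fused CUDA kernel, and the function name `bluestein_dft` is hypothetical.

```python
import numpy as np

def bluestein_dft(x):
    """DFT of arbitrary length n via Bluestein's chirp-z identity.

    Uses n*k = (n^2 + k^2 - (k - n)^2) / 2 to turn the DFT into a
    convolution with a chirp, evaluated with power-of-two FFTs.
    """
    x = np.asarray(x, dtype=np.complex128)
    n = x.shape[0]
    k = np.arange(n)
    # Chirp w[k] = exp(-i*pi*k^2/n); reduce k^2 mod 2n to keep the phase small.
    w = np.exp(-1j * np.pi * ((k * k) % (2 * n)) / n)
    # Power-of-two length large enough for a linear convolution (>= 2n - 1).
    m = 1 << (2 * n - 1).bit_length()
    a = np.zeros(m, dtype=np.complex128)
    a[:n] = x * w
    b = np.zeros(m, dtype=np.complex128)
    b[:n] = np.conj(w)                     # non-negative lags
    b[m - n + 1:] = np.conj(w[1:][::-1])   # negative lags, wrapped around
    conv = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b))
    return w * conv[:n]

# Example: a ring of 12 samples (not a power of two).
rng = np.random.default_rng(0)
ring = rng.standard_normal(12)
spectrum = bluestein_dft(ring)
```

The result can be compared against `np.fft.fft(ring)`; the two should agree to floating-point precision.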

Abstract

HEALPix (Hierarchical Equal Area isoLatitude Pixelization) is a widely adopted spherical grid system in astrophysics, cosmology, and Earth sciences. Its equal-area, iso-latitude structure makes it particularly well suited for large-scale data analysis on the sphere. However, implementing high-performance spherical harmonic transforms (SHTs) on HEALPix grids remains challenging due to irregular pixel geometry, latitude-dependent alignments, and the demand for high-resolution transforms at scale. In this work, we present cuHPX, an optimized CUDA library for spherical harmonic analysis and related utilities on HEALPix grids. Beyond delivering substantial performance improvements, cuHPX provides high numerical accuracy, analytic gradients for integration with deep learning frameworks, memory-efficient out-of-core execution, and flexible regridding between HEALPix, equiangular, and other common spherical grid formats. In our evaluation, cuHPX achieves rapid spectral convergence and a speedup of more than 20× over existing libraries while maintaining numerical consistency. By combining accuracy, scalability, and differentiability, cuHPX enables a broad range of applications in climate science, astrophysics, and machine learning, bridging optimized GPU kernels and scientific workflows.
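
To make the "irregular pixel geometry" and "latitude-dependent alignments" concrete, the sketch below computes the number of pixels on each HEALPix ring and the azimuth of the first pixel on that ring, then applies the corresponding phase correction after a per-ring real FFT. It assumes the standard HEALPix RING convention (polar-cap ring $i$ holds $4i$ pixels, equatorial rings hold $4N_{side}$ pixels, and rings are offset by either half a pixel or zero). The function names `healpix_rings` and `aligned_ring_rfft` are hypothetical and are not part of the cuHPX API.

```python
import numpy as np

def healpix_rings(nside):
    """Pixels per ring and first-pixel azimuth, rings ordered north to south."""
    i = np.arange(1, 4 * nside)              # ring index 1 .. 4*nside - 1
    i_cap = np.minimum(i, 4 * nside - i)     # ring count from the nearest pole
    in_cap = i_cap < nside                   # True inside the polar caps
    npix_ring = np.where(in_cap, 4 * i_cap, 4 * nside)
    # Cap rings are always offset by half a pixel; equatorial rings alternate.
    shifted = in_cap | ((i + nside) % 2 == 0)
    phi0 = np.where(shifted, np.pi / npix_ring, 0.0)
    return npix_ring, phi0

def aligned_ring_rfft(values, phi0):
    """rFFT along one ring, rotated so every ring shares the origin phi = 0."""
    m = np.arange(values.shape[-1] // 2 + 1)
    return np.fft.rfft(values, axis=-1) * np.exp(-1j * m * phi0)

npix_ring, phi0 = healpix_rings(2)
print(npix_ring)        # [4 8 8 8 8 8 4]
print(npix_ring.sum())  # 48, i.e. 12 * nside**2
```

Because each ring has its own length and offset, the Fourier coefficients of different rings only become comparable after the $e^{-i m \phi_0}$ rotation, which is the role of the phase-alignment step described in the Figure 5 caption.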

Paper Structure

This paper contains 24 sections, 29 equations, 11 figures.

Figures (11)

  • Figure 1: Visualization of (left) an equiangular latitude/longitude grid and (right) a HEALPix grid on (A) a sphere and (B) the latitude $\theta$ and longitude $\phi$ plane.
  • Figure 2: (A) Schematic diagram of hierarchical subdivision of HEALPix faces on a sphere. (B) Iso-latitude distribution of HEALPix pixels on the latitude/longitude plane.
  • Figure 3: Schematic diagrams of (A) RING, (B) NESTED, (C) Flat indexing layout on HEALPix grids. Blue indicates counterclockwise ordering from the south, while pink indicates clockwise ordering from the east. (D) Data reordering on the HEALPix grids.
  • Figure 4: (A) Reordering paths between HEALPix indexing layouts via the xyf intermediate representation. (B) Illustration of the tuple $(x,y,f)$, where $x$ and $y$ denote local coordinates in a base quadrilateral, and $f$ identifies the face index.
  • Figure 5: Efficient decomposition of spherical harmonic transforms on HEALPix. A real FFT (rFFT) is first performed in longitude ($\phi$) for each ring, followed by phase alignment, zero padding, and a Legendre transform in latitude ($\theta$).
  • ...and 6 more figures
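
The $(x,y,f)$ tuple from the Figure 4 caption can be illustrated for the NESTED layout. The sketch below assumes the standard HEALPix NESTED convention, in which the index within a face interleaves the bits of $x$ (even bit positions) and $y$ (odd bit positions) and the face index $f$ occupies the high-order part. All function names are hypothetical, and this is plain Python for exposition rather than cuHPX's GPU remapping code.

```python
def compress_bits(v):
    """Pack the even-position bits of v into a contiguous integer."""
    out, bit = 0, 0
    while v:
        out |= (v & 1) << bit
        v >>= 2
        bit += 1
    return out

def spread_bits(v):
    """Inverse of compress_bits: move bit j of v to bit position 2*j."""
    out, bit = 0, 0
    while v:
        out |= (v & 1) << (2 * bit)
        v >>= 1
        bit += 1
    return out

def nest_to_xyf(pix, nside):
    """Split a NESTED pixel index into local coordinates (x, y) and face f."""
    f, local = divmod(pix, nside * nside)
    return compress_bits(local), compress_bits(local >> 1), f

def xyf_to_nest(x, y, f, nside):
    """Rebuild the NESTED pixel index from (x, y, f)."""
    return f * nside * nside + spread_bits(x) + (spread_bits(y) << 1)

# Example with nside = 4: pixel 93 lies in face 5 with local index 0b1101.
print(nest_to_xyf(93, 4))       # (3, 2, 5)
print(xyf_to_nest(3, 2, 5, 4))  # 93
```

Routing every reordering through one intermediate form like this means each layout needs only a conversion to and from $(x,y,f)$, rather than a dedicated routine for every pair of layouts.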