Geographic Location Encoding with Spherical Harmonics and Sinusoidal Representation Networks
Marc Rußwurm, Konstantin Klemmer, Esther Rolf, Robin Zbinden, Devis Tuia
TL;DR
This paper addresses the encoding of global geographic coordinates for machine learning. It replaces Double Fourier Sphere (DFS) embeddings, which assume a rectangular domain, with spherical harmonic (SH) basis functions, which are defined on the sphere and handle polar regions naturally. It further shows that Sinusoidal Representation Networks (SirenNets) can be interpreted as learned DFS embeddings, which gives a unified, learnable view of coordinate encoding. In experiments on a synthetic checkerboard dataset and on real-world datasets (ERA5 climate data, land-ocean, and iNaturalist), SH embeddings are robust across models: SH combined with SirenNet achieves state-of-the-art performance on multiple tasks, and SH alone is competitive even with a simple linear mapping. The work offers practical guidance for location encoding in geospatial ML, recommending SH for global-scale and polar-sensitive problems and SirenNets for versatile, high-capacity representations, with implications for geospatial modeling and implicit neural representations on spheres.
Abstract
Learning representations of geographical space is vital for any machine learning model that integrates geolocated data, spanning application domains such as remote sensing, ecology, or epidemiology. Recent work embeds coordinates using sine and cosine projections based on Double Fourier Sphere (DFS) features. These embeddings assume a rectangular data domain even on global data, which can lead to artifacts, especially at the poles. At the same time, little attention has been paid to the exact design of the neural network architectures with which these functional embeddings are combined. This work proposes a novel location encoder for globally distributed geographic data that combines spherical harmonic basis functions, natively defined on spherical surfaces, with sinusoidal representation networks (SirenNets) that can be interpreted as learned Double Fourier Sphere embeddings. We systematically evaluate positional embeddings and neural network architectures across various benchmarks and synthetic evaluation datasets. In contrast to previous approaches that require the combination of both positional encoding and neural networks to learn meaningful representations, we show that both spherical harmonics and sinusoidal representation networks are competitive on their own but set state-of-the-art performance across tasks when combined. The model code and experiments are available at https://github.com/marccoru/locationencoder.
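To make the contrast drawn in the abstract concrete, the following is a minimal standard-library Python sketch. It is not the authors' implementation (see the repository linked above for that), and every function name in it (`assoc_legendre`, `real_sph_harm`, `sh_embedding`, `grid_embedding`, `sine_layer`) is hypothetical. It assumes orthonormal real spherical harmonics with the Condon-Shortley phase, degrees l = 0, ..., num_degrees - 1, and orders m = -l, ..., l. The `grid_embedding` function is a simplified sine/cosine encoding on a rectangular longitude-latitude domain and only stands in for the DFS-style embeddings, which it does not reproduce exactly. The sine layer uses arbitrary weights and an arbitrary frequency scale `w0`.

```python
import math
import random


def assoc_legendre(l, m, x):
    """Associated Legendre function P_l^m(x) for 0 <= m <= l (Condon-Shortley phase)."""
    pmm = 1.0
    if m > 0:
        somx2 = math.sqrt(max(0.0, 1.0 - x * x))
        fact = 1.0
        for _ in range(m):
            pmm *= -fact * somx2  # builds (-1)^m (2m-1)!! (1-x^2)^(m/2)
            fact += 2.0
    if l == m:
        return pmm
    pmmp1 = x * (2 * m + 1) * pmm  # P_{m+1}^m
    if l == m + 1:
        return pmmp1
    for ll in range(m + 2, l + 1):  # upward recurrence in the degree
        pll = ((2 * ll - 1) * x * pmmp1 - (ll + m - 1) * pmm) / (ll - m)
        pmm, pmmp1 = pmmp1, pll
    return pmmp1


def real_sph_harm(l, m, lon_rad, colat_rad):
    """Orthonormal real spherical harmonic of degree l and order m."""
    am = abs(m)
    norm = math.sqrt((2 * l + 1) / (4 * math.pi)
                     * math.factorial(l - am) / math.factorial(l + am))
    p = assoc_legendre(l, am, math.cos(colat_rad))
    if m == 0:
        return norm * p
    if m > 0:
        return math.sqrt(2) * norm * p * math.cos(m * lon_rad)
    return math.sqrt(2) * norm * p * math.sin(am * lon_rad)


def sh_embedding(lon_deg, lat_deg, num_degrees):
    """Embed (lon, lat) in degrees as num_degrees**2 spherical harmonic features."""
    lon = math.radians(lon_deg)
    colat = math.radians(90.0 - lat_deg)  # colatitude: 0 at the North Pole
    return [real_sph_harm(l, m, lon, colat)
            for l in range(num_degrees) for m in range(-l, l + 1)]


def grid_embedding(lon_deg, lat_deg, num_scales):
    """Simplified sin/cos encoding that treats lon/lat as a rectangular domain."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    feats = []
    for s in range(num_scales):
        k = 2.0 ** s  # assumed geometric frequency schedule
        feats += [math.sin(k * lon), math.cos(k * lon),
                  math.sin(k * lat), math.cos(k * lat)]
    return feats


def sine_layer(x, weights, biases, w0=1.0):
    """One dense layer with a sine activation: sin(w0 * (W x + b))."""
    return [math.sin(w0 * (sum(w * xi for w, xi in zip(row, x)) + b))
            for row, b in zip(weights, biases)]


if __name__ == "__main__":
    rng = random.Random(0)
    emb = sh_embedding(8.55, 47.37, num_degrees=4)  # arbitrary example coordinate
    weights = [[rng.uniform(-1.0, 1.0) for _ in emb] for _ in range(8)]
    hidden = sine_layer(emb, weights, [0.0] * 8)
    print(len(emb), len(hidden))
```

In this sketch, the North Pole receives the same spherical harmonic embedding for every longitude, because all terms with m != 0 vanish at colatitude 0. The rectangular encoding instead assigns the pole different features for different longitudes, which is one way to see how the polar artifacts mentioned above can arise.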
