Orca: Ocean Significant Wave Height Estimation with Spatio-temporally Aware Large Language Models
Zhe Li, Ronghui Xu, Jilin Hu, Zhong Peng, Xi Lu, Chenjuan Guo, Bin Yang
TL;DR
The paper tackles the challenge of estimating significant wave height (SWH) when buoy observations are sparse and traditional numerical models are computationally expensive. It introduces Orca, a framework that uses Large Language Models (LLMs) as the estimation backbone, augmented with a spatio-temporal encoding module, task-specific prompt design, and physics-guided regularization to estimate grid-based SWH from limited buoy data. Key contributions include 1) a spatio-temporally aware encoding pipeline, 2) prompt templates and soft-prompt embeddings for LLM adaptation, 3) a spatial encoding based on Z-order mapping, and 4) empirical validation in the Gulf of Mexico showing state-of-the-art accuracy and substantial speedups over traditional approaches. The work enables accurate, efficient SWH estimation under data scarcity, with practical implications for marine operations and safety.
Abstract
Significant wave height (SWH) is a vital metric in marine science, and accurate SWH estimation is crucial for applications such as marine energy development, fisheries, and early warning systems for potential risks. Traditional SWH estimation methods based on numerical models and physical theories are hindered by computational inefficiency. Recently, machine learning has emerged as an appealing alternative that can improve accuracy and reduce computation time. However, because observational technology is limited and costly, the scarcity of real-world data restricts the potential of machine learning models. To overcome these limitations, we propose Orca, an ocean SWH estimation framework. Specifically, Orca enhances the limited spatio-temporal reasoning abilities of classic LLMs with a novel spatio-temporally aware encoding module. By segmenting the limited buoy observational data temporally, encoding the buoys' locations spatially, and designing prompt templates, Orca capitalizes on the robust generalization ability of LLMs to estimate significant wave height effectively with limited data. Experimental results on the Gulf of Mexico demonstrate that Orca achieves state-of-the-art performance in SWH estimation.
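
The TL;DR mentions a spatial encoding based on Z-order mapping. As a rough illustration of the general technique (not the authors' implementation), the sketch below quantizes a latitude/longitude pair onto a grid and interleaves the bits of the row and column indices into a single Z-order (Morton) index, so that nearby locations tend to receive nearby indices. The function names, the grid resolution, and the bounding box used in the example are all assumptions made for illustration.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton code.

    Bit i of x goes to position 2*i, bit i of y goes to position 2*i + 1.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def z_order_index(lat, lon, lat_range, lon_range, bits=16):
    """Map a (lat, lon) point to a Z-order index on a 2^bits x 2^bits grid.

    lat_range and lon_range are (min, max) tuples describing the region;
    they are hypothetical inputs, not values taken from the paper.
    """
    cells = 1 << bits

    def quantize(value, lo, hi):
        # Scale to [0, cells) and clamp so the upper boundary stays in range.
        frac = (value - lo) / (hi - lo)
        return min(cells - 1, max(0, int(frac * cells)))

    col = quantize(lon, *lon_range)
    row = quantize(lat, *lat_range)
    return interleave_bits(col, row, bits)


# Example with an assumed, approximate bounding box for the Gulf of Mexico.
LAT_RANGE = (18.0, 31.0)     # assumption for illustration only
LON_RANGE = (-98.0, -80.0)   # assumption for illustration only
buoy_index = z_order_index(25.9, -89.7, LAT_RANGE, LON_RANGE, bits=8)
```

A single integer index of this kind is one simple way to hand a two-dimensional location to a sequence model; how Orca turns such an index into an embedding is not specified in this section.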
