WeatherFormer: A Pretrained Encoder Model for Learning Robust Weather Representations from Small Datasets
Adib Hasan, Mardavij Roozbehani, Munther Dahleh
TL;DR
WeatherFormer introduces a pretrained weather encoder that learns robust representations from a large satellite-based pretraining corpus, enabling effective learning on small downstream datasets. The model uses a novel spatiotemporal positional encoding and a pretraining task that predicts masked weather variables, improving generalization to tasks with limited observations. In finetuning experiments, WeatherFormer achieves state-of-the-art performance in county-level soybean yield prediction and influenza forecasting in NYC, demonstrating transfer to agriculture and epidemiology domains. The work suggests pretrained weather encoders can unlock performance gains across weather-dependent applications and motivates yearly retraining with updated data.
Abstract
This paper introduces WeatherFormer, a transformer encoder-based model designed to learn robust weather features from minimal observations. It addresses the challenge of modeling complex weather dynamics from small datasets, a bottleneck for many prediction tasks in agriculture, epidemiology, and climate science. WeatherFormer was pretrained on a large dataset comprising 39 years of satellite measurements across the Americas. With a novel pretraining task and fine-tuning, WeatherFormer achieves state-of-the-art performance in county-level soybean yield prediction and influenza forecasting. Technical innovations include a spatiotemporal encoding that captures geographical, annual, and seasonal variations, an adaptation of the transformer architecture to continuous weather data, and a pretraining strategy that learns representations robust to missing weather features. This paper demonstrates, for the first time, the effectiveness of pretraining large transformer encoder models for weather-dependent applications across multiple domains.
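To make the idea of pretraining for robustness to missing weather features concrete, the following is a minimal sketch of a masked-variable objective in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names, the 30% mask fraction, the choice to hide whole variables (columns) and fill them with zeros, and the use of mean squared error are all hypothetical, since the abstract does not specify these details.

```python
import numpy as np


def mask_weather_features(x, mask_fraction=0.3, rng=None):
    """Hide a random subset of weather variables (columns) in one sequence.

    x: array of shape (time_steps, n_features).
    Returns the masked copy of x and a boolean column mask (True = hidden).
    The mask fraction and zero fill value are assumptions for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_features = x.shape[1]
    # Always hide at least one variable so the objective is non-trivial.
    n_masked = max(1, int(round(mask_fraction * n_features)))
    hidden = rng.choice(n_features, size=n_masked, replace=False)
    mask = np.zeros(n_features, dtype=bool)
    mask[hidden] = True
    x_masked = x.copy()
    x_masked[:, mask] = 0.0
    return x_masked, mask


def masked_mse(prediction, target, mask):
    """Mean squared error computed only over the hidden variables."""
    diff = prediction[:, mask] - target[:, mask]
    return float(np.mean(diff ** 2))
```

In such a setup, an encoder would receive `x_masked` and be trained to minimize `masked_mse` between its output and the original `x`, so that it learns to infer hidden variables from the observed ones.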
