Interpolation of mountain weather forecasts by machine learning
Kazuma Iwase, Tomoyuki Takenawa
TL;DR
This work addresses the accuracy of precipitation and temperature forecasts in mountainous regions by interpolating forecasts from surrounding plains, using regression with lag features and a mixed loss that blends $\text{MSE}$ with a shifted binary cross-entropy term. Among the tested models, LightGBM achieves the best balance of accuracy and training efficiency on a small, multi-site dataset centered on Mt. Fuji and Hakone, with predictions made $2$, $7$, $8$, and $9$ hours ahead. The study shows that including forecast data from nearby plains can outperform some public forecast services for mountain temperature and, at some horizons, precipitation, while the loss $L = \alpha\,\text{MSE} + (1-\alpha)\,L_{\text{binary}}$ mitigates bias in zero-rain cases. Overall, the approach offers a practical, data-efficient path to improving mountain weather predictions, though the results are constrained by data size and the reliability of the surrounding forecasts.
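To make the mixed loss concrete, the following is a minimal sketch of $L = \alpha\,\text{MSE} + (1-\alpha) L_{binary}$ in plain Python. The exact form of the shifted binary cross-entropy term is not given here, so the sketch assumes one plausible reading: the rain/no-rain label is $1$ when the observed value is positive, and the predicted probability is a sigmoid of the regression output minus a shift. The names `mixed_loss`, `shift`, and the default values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mixed_loss(y_true, y_pred, alpha=0.5, shift=0.5, eps=1e-12):
    """Return alpha * MSE + (1 - alpha) * L_binary over paired sequences.

    ASSUMPTION: L_binary is a binary cross-entropy between the indicator
    (y_true > 0) and sigmoid(y_pred - shift). The source does not specify
    the shifted term, so this form is illustrative only.
    """
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n

    bce = 0.0
    for t, p in zip(y_true, y_pred):
        label = 1.0 if t > 0 else 0.0
        # Clip the probability so the logarithms stay finite.
        prob = min(max(sigmoid(p - shift), eps), 1.0 - eps)
        bce -= label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob)
    bce /= n

    return alpha * mse + (1.0 - alpha) * bce
```

With `alpha=1.0` the function reduces to plain MSE; lowering `alpha` puts more weight on getting the rain/no-rain decision right, which is the mechanism by which such a loss could reduce bias on zero-rain hours.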
Abstract
Recent advances in numerical simulation methods based on physical models, and their combination with machine learning, have improved the accuracy of weather forecasts. However, accuracy decreases in complex terrain such as mountainous regions because these methods usually use grid cells several kilometers on a side and simple machine learning models. While deep learning has also made significant progress in recent years, applying it directly makes it difficult to utilize the physical knowledge used in the simulations. This paper proposes a method that uses machine learning to improve weather forecasts in mountainous regions by interpolating future weather there from forecast data for the surrounding plains and past observed data. We focus on mountainous regions in Japan and predict temperature and precipitation, mainly using LightGBM as the machine learning model. Despite using a small dataset, our method, through feature engineering and model tuning, partially achieves improvements in RMSE with significantly less training time.
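As an illustration of how past observed data can be turned into model inputs, the sketch below builds simple lagged rows from a single hourly series in plain Python. The function name `add_lag_features` and the choice of lags are hypothetical; the actual feature set used in the paper is not specified in this section.

```python
def add_lag_features(series, lags):
    """Build rows [x_t, x_{t-l1}, x_{t-l2}, ...] for each time step t.

    ASSUMPTION: `series` is an evenly spaced (e.g. hourly) list of values
    and `lags` is a list of positive integer offsets. Time steps that lack
    a full history are dropped rather than padded.
    """
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        rows.append([series[t]] + [series[t - lag] for lag in lags])
    return rows


# Example: temperatures at four consecutive hours, with lags of 1 and 2 hours.
rows = add_lag_features([1, 2, 3, 4], lags=[1, 2])
```

Each row could then be joined with the plains forecast for the target hour and passed to a regressor such as LightGBM; dropping the first `max(lags)` steps avoids inventing values for a history that does not exist.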
