ClimateIQA: A New Dataset and Benchmark to Advance Vision-Language Models in Meteorology Anomalies Analysis

Jian Chen; Peilin Zhou; Yining Hua; Dading Chong; Meng Cao; Yaowei Li; Wei Chen; Bing Zhu; Junwei Liang; Zixuan Yuan

TL;DR

This paper tackles the challenge of interpreting meteorological heatmaps with Vision-Language Models (VLMs), where irregular region shapes and complex color schemes defeat general-purpose models. It introduces SPOT (Sparse Position and Outline Tracking), an algorithm that localizes irregularly shaped colored regions, and ClimateIQA, a large-scale, geo-grounded VQA dataset built from ERA5 reanalysis data and geographic metadata. By instruction-tuning on ClimateIQA, the authors build Climate-Zoo, a family of fine-tuned VLMs based on LLaVA-1.6, Qwen-VL-Chat, and Yi-VL-6B that achieves state-of-the-art performance on wind gust, precipitation, and wind chill/heat index heatmaps. Evaluation uses task-specific metrics (F1, Element Match, Haversine distance, BLEU/ROUGE, and GPT-4o-based evaluation) together with extensive ablations, and shows substantial gains over baselines. The work targets practical use in meteorology and disaster mitigation by enabling accurate, geo-located visual reasoning on weather heatmaps.
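Since Haversine is listed among the metrics, a minimal sketch of the standard haversine great-circle distance may help show how a predicted location could be compared with a reference location. The function name, the kilometre unit, and the Earth-radius constant are assumptions made for illustration; the summary does not say how the paper normalizes or aggregates this distance.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant, not taken from the paper


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Standard haversine term; clamp to 1.0 to guard against rounding just above 1.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Illustrative call: distance between a predicted and a reference point (made-up coordinates).
error_km = haversine_km(65.0, -40.0, 64.0, -38.0)
```

A smaller distance means the model's answer lies closer to the reference location; identical points give zero.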

Abstract

Meteorological heatmaps play a vital role in deciphering extreme weather phenomena, yet their inherent complexities, marked by irregular contours, unstructured patterns, and complex color variations, present unique analytical hurdles for state-of-the-art Vision-Language Models (VLMs). Current state-of-the-art models like GPT-4o, Qwen-VL, and LLaVA 1.6 struggle with tasks such as precise color identification and spatial localization, resulting in inaccurate or incomplete interpretations. To address these challenges, we introduce Sparse Position and Outline Tracking (SPOT), a novel algorithm specifically designed to process irregularly shaped colored regions in visual data. SPOT identifies and localizes these regions by extracting their spatial coordinates, enabling structured representations of irregular shapes. Building on SPOT, we construct ClimateIQA, a novel meteorological visual question answering (VQA) dataset comprising 26,280 high-resolution heatmaps and 762,120 instruction samples for wind gust, total precipitation, wind chill index, and heat index analysis. ClimateIQA enhances VLM training by incorporating spatial cues, geographic metadata, and reanalysis data, improving model accuracy in interpreting and describing extreme weather features. Furthermore, we develop Climate-Zoo, a suite of fine-tuned VLMs based on SPOT-empowered ClimateIQA, which significantly outperforms existing models in meteorological heatmap tasks.
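The abstract does not describe SPOT's internals, so the following is only a rough illustration of the general idea it names: turning an irregular colored region into an outline plus a sparse set of representative positions. It is not the authors' algorithm. The grid-of-labels input, the 4-neighbour outline rule, the block-based sampling, and all function names are assumptions made for this sketch.

```python
def region_cells(grid, target):
    """Coordinates (row, col) of every cell whose color label equals `target`."""
    return {(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v == target}


def outline_cells(cells):
    """Region cells that touch at least one non-region cell (4-neighbourhood)."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(r, c) for r, c in cells
            if any((r + dr, c + dc) not in cells for dr, dc in steps)}


def sparse_points(cells, stride):
    """One representative region cell per stride x stride block that contains region cells."""
    blocks = {}
    for r, c in cells:
        blocks.setdefault((r // stride, c // stride), []).append((r, c))
    points = []
    for members in blocks.values():
        mean_r = sum(r for r, _ in members) / len(members)
        mean_c = sum(c for _, c in members) / len(members)
        # Snap the block centroid to the nearest actual region cell, so the
        # representative point always lies inside the region; ties break by coordinate.
        points.append(min(members, key=lambda p: ((p[0] - mean_r) ** 2 + (p[1] - mean_c) ** 2, p)))
    return sorted(points)


# Toy "heatmap": 'R' marks the color of interest, '.' is everything else.
grid = [
    "......",
    ".RRR..",
    ".RRR..",
    ".RRR..",
    "......",
]
cells = region_cells(grid, "R")
outline = outline_cells(cells)
points = sparse_points(cells, stride=3)
```

In this toy case the outline is the ring of eight border cells around the single interior cell, and a larger `stride` yields fewer representative points. On a real heatmap, the pixel positions would still need to be mapped to geographic coordinates, which this sketch leaves out.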

Paper Structure (22 sections, 4 equations, 8 figures, 4 tables, 1 algorithm)

Introduction
Related work
AI for meteorology
Vision language models and visual question answering
Preliminary Investigation of VLM Capabilities on Meteorological Heatmaps
ClimateIQA: Dataset Building Pipeline
Data collection
Sparse Position and Outline Tracking (SPOT)
Instruction-tuning data construction
Dataset statistics
Climate-Zoo: Adapting VLMs to meteorology
Evaluation
Metrics
Results and analysis
Conclusions
...and 7 more sections

Figures (8)

Figure 1: Comparative Analysis of Visual Chat and Reasoning Abilities in Meteorological Anomalies Analysis. Regions marked in yellow indicate strong breezes, red indicates hurricanes, and green indicates moderate breezes. In the conversation, hallucinations are marked in blue, refusal-to-answer responses are marked in red, and accurate responses are marked in green.
Figure 2: Result of an in-depth evaluation via Prompt-Engineering GPT-4o. Sentences in red mark inaccurate responses, sentences in orange and black mean surprising findings (patches and geography information), and sentences in green mark accurate answers.
Figure 3: The process of constructing ClimateIQA. Images were processed using SPOT to extract color contours (marked in blue) and representative point coordinates (marked in purple), such as (-40, 65). The extracted data were integrated into geographic knowledge bases to retrieve location-specific information. These data, including location, coordinates, and weather variables, were then input into predefined question-and-answer templates, resulting in the generation of 762,120 question-answer pairs. The final dataset, ClimateIQA, pairs these QA pairs with 26,280 images, enabling comprehensive visual question answering.
Figure 4: The SPOT algorithm identifies representative points (enlarged purple dots) within strong gale zones (light coral) from a high-resolution image, with deep blue outlines precisely tracing the contours, showing alignment of points with the contours.
Figure 5: Low-resolution results of the SPOT algorithm. SPOT accurately outlines the shapes of light coral areas within a low-resolution image, but selects fewer representative points than it does on high-resolution heatmaps.
...and 3 more figures
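The template-filling step described in the Figure 3 caption, where location, coordinates, and weather variables are inserted into predefined question-and-answer templates, can be sketched as follows. The template wording, the record fields, and the sample values are all hypothetical placeholders; the paper's actual templates are not shown here.

```python
# Hypothetical question/answer templates; the real ones are not given in this summary.
TEMPLATES = [
    ("Where does the {variable} heatmap show {level}?",
     "{level} appears near {place}, around coordinates ({x}, {y})."),
    ("What condition does the {variable} heatmap show near {place}?",
     "The area near {place} shows {level}."),
]


def build_qa(record):
    """Fill every template with one record and return question/answer pairs."""
    return [
        {"question": q.format(**record), "answer": a.format(**record)}
        for q, a in TEMPLATES
    ]


# Made-up record standing in for one extracted point plus its looked-up location.
record = {
    "variable": "wind gust",
    "level": "strong gale",
    "place": "ExamplePlace",
    "x": -40,
    "y": 65,
}
qa_pairs = build_qa(record)
```

Each extracted point yields one pair per template, which is how a modest number of images can produce a much larger number of instruction samples.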
