Towards Temporal Change Explanations from Bi-Temporal Satellite Images

Ryo Tsujimoto; Hiroki Ouchi; Hidetaka Kamigaito; Taro Watanabe

TL;DR

Explores generating temporal-change explanations from bi-temporal satellite images with LVLMs, which accept only a single image as input. Proposes three prompting strategies (All-at-Once, Step-by-Step, and Hybrid) and evaluates them on the Levir-CC dataset using automatic noun coverage and manual ratings of truthfulness and informativeness. Finds that Step-by-Step prompting yields the strongest explanations overall, while All-at-Once prompting with GPT-4V excels in truthfulness and informativeness and Hybrid prompting offers strong coverage. Highlights over-explanation as a challenge when changes are minimal, and suggests future work on refined evaluation, task-tailored preprocessing, and multi-image LVLMs for richer temporal explanations.
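The automatic noun-coverage evaluation mentioned above can be illustrated with a minimal sketch. The paper's exact definition is not given in this summary, so the version below is an assumption: it treats coverage as the fraction of nouns from a reference caption that reappear as whole words in the generated explanation. The function name `noun_coverage` and the pre-extracted noun list are hypothetical; a real pipeline would obtain the nouns with a part-of-speech tagger.

```python
import re


def noun_coverage(reference_nouns, generated_text):
    """Return the share of reference nouns found in the generated text.

    Assumption: nouns are already extracted from the reference caption.
    Matching is exact and case-insensitive, so "buildings" does not
    match "building"; lemmatization is deliberately left out.
    """
    # Lowercased alphabetic tokens of the generated explanation.
    tokens = set(re.findall(r"[a-z]+", generated_text.lower()))
    refs = {noun.lower() for noun in reference_nouns}
    if not refs:
        # No reference nouns: define coverage as 0.0 to avoid dividing by zero.
        return 0.0
    return len(refs & tokens) / len(refs)


score = noun_coverage(["road", "building"], "A new road appears near the trees.")
```

Here one of the two reference nouns ("road") occurs in the explanation, so `score` is 0.5. A recall-style score of this kind rewards mentioning the changed objects but does not penalize over-explanation, which is one reason a manual truthfulness rating is a useful complement.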

Abstract

Explaining temporal changes between satellite images taken at different times is important for urban planning and environmental monitoring. However, manually constructing datasets for this task is costly, so human-AI collaboration is promising. As a step in this direction, we investigate the ability of Large-scale Vision-Language Models (LVLMs) to explain temporal changes between satellite images. While LVLMs are known to generate good image captions, they receive only a single image as input. To handle a pair of satellite images as input, we propose three prompting methods. Through human evaluation, we confirm the effectiveness of our prompting method based on step-by-step reasoning.

Paper Structure (25 sections, 6 figures, 5 tables)

Introduction
Method
All-at-Once Prompting
Step-by-Step Prompting
Hybrid Prompting
Experiments
Experimental Setup
Models
Dataset
Prompt constraints
Evaluation Metrics
Automatic evaluation
Manual evaluation
Experimental Results
Overall
...and 10 more sections

Figures (6)

Figure 1: Example of bi-temporal satellite images with their captions in Levir-CC; the left is the one before the change and the right is the one after the change.
Figure 2: Explaining temporal changes from bi-temporal satellite images using two types of prompting
Figure 3: Example of Truthfulness score of 1
Figure 4: Example of Informativeness score of 1
Figure 5: Example output for informativeness score of 1
...and 1 more figure
