ChangeChat: An Interactive Model for Remote Sensing Change Analysis via Multimodal Instruction Tuning

Pei Deng; Wenqian Zhou; Hanlin Wu

TL;DR

This work introduces ChangeChat, the first bitemporal vision-language model (VLM) designed specifically for interactive remote sensing (RS) change analysis. It matches or surpasses state-of-the-art (SOTA) methods on specific tasks and significantly outperforms the latest general-domain model, GPT-4.

Abstract

Remote sensing (RS) change analysis is vital for monitoring Earth's dynamic processes by detecting alterations in images over time. Traditional change detection excels at identifying pixel-level changes but lacks the ability to contextualize these alterations. While recent advancements in change captioning offer natural language descriptions of changes, they do not support interactive, user-specific queries. To address these limitations, we introduce ChangeChat, the first bitemporal vision-language model (VLM) designed specifically for RS change analysis. ChangeChat utilizes multimodal instruction tuning, allowing it to handle complex queries such as change captioning, category-specific quantification, and change localization. To enhance the model's performance, we developed the ChangeChat-87k dataset, which was generated using a combination of rule-based methods and GPT-assisted techniques. Experiments show that ChangeChat offers a comprehensive, interactive solution for RS change analysis, achieving performance comparable to or even better than state-of-the-art (SOTA) methods on specific tasks, and significantly surpassing the latest general-domain model, GPT-4. Code and pre-trained weights are available at https://github.com/hanlinwu/ChangeChat.

Paper Structure (12 sections, 2 figures, 4 tables)

Introduction
Methodology
RS Change Instruction Dataset Generation
Instruction Tuning for ChangeChat
ChangeChat Architecture
Training Details
Experimental Results
Dataset and Evaluation Metrics
Comparison with SOTA Change Captioning Models
Evaluation on Diverse Change Analysis Tasks
Discussion on CoT reasoning
Conclusion

Figures (2)

Figure 1: Overview of the proposed ChangeChat. The left side illustrates the network architecture, while the right side shows examples of various types of change analysis.
Figure 2: Two examples of change localization are provided, with the generated coordinates visualized.
