A Dynamic Fusion Model for Consistent Crisis Response

Xiaoying Song, Anirban Saha Anik, Eduardo Blanco, Vanessa Frias-Martinez, Lingzi Hong

TL;DR

This paper addresses the problem of stylistic inconsistency in crisis-response generation by defining a formal consistency metric across professionalism, actionability, and relevance, and introducing a fusion-based generation framework that combines Instructional Prompt and RAG with evaluation-guided synthesis. The fusion mechanism is formalized as $CC(N, D) = \mathcal{L}(\text{Fuse}( M_{\text{IP}}(N), M_{\text{RAG}}(N), \mathbf{s}_{\text{IP}}, \mathbf{s}_{\text{RAG}} ))$, enabling instance-level optimization across critical communicative dimensions. Empirical evaluation on a large Twitter crisis dataset shows fusion-based methods improve overall quality and drastically reduce variation compared with baselines, with cross-crisis generalization demonstrating robustness to earthquakes and typhoons. Human evaluations further corroborate the superiority of fused responses in terms of consistency and usefulness, highlighting practical impact for scalable, trustworthy crisis communication. The work advances crisis informatics by providing a scalable, model-agnostic approach to producing uniformly high-quality guidance during disasters.
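The fusion formula above can be read as a small pipeline: two generators produce candidates, an evaluator scores each candidate, and a language model writes the final response from both candidates and their scores. The following is a minimal sketch under stated assumptions, not the authors' implementation. The names `Candidate`, `build_fusion_prompt`, and `fused_response`, the prompt wording, and the callable signatures are all hypothetical; the data $D$ in $CC(N, D)$ is assumed to be used inside the RAG generator.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# The three style dimensions named in the text.
DIMENSIONS = ("professionalism", "actionability", "relevance")


@dataclass
class Candidate:
    source: str               # "IP" or "RAG" (labels are illustrative)
    text: str                 # candidate response text
    scores: Dict[str, float]  # one score per dimension (the vector s)


def build_fusion_prompt(need: str, ip: Candidate, rag: Candidate) -> str:
    """Assemble a hypothetical fusion prompt; the wording is an assumption."""
    lines = [f"User need: {need}", ""]
    for cand in (ip, rag):
        lines.append(f"[{cand.source}] {cand.text}")
        for dim in DIMENSIONS:
            lines.append(f"  {dim}: {cand.scores[dim]:.1f}")
    lines.append("")
    lines.append("Write one response that keeps the stronger candidate's "
                 "content on each dimension.")
    return "\n".join(lines)


def fused_response(
    need: str,
    m_ip: Callable[[str], str],                       # stands in for M_IP
    m_rag: Callable[[str], str],                      # stands in for M_RAG
    evaluate: Callable[[str, str], Dict[str, float]], # produces s_IP, s_RAG
    llm: Callable[[str], str],                        # stands in for L
) -> str:
    """Sketch of L(Fuse(M_IP(N), M_RAG(N), s_IP, s_RAG))."""
    ip_text = m_ip(need)
    rag_text = m_rag(need)
    ip = Candidate("IP", ip_text, evaluate(need, ip_text))
    rag = Candidate("RAG", rag_text, evaluate(need, rag_text))
    return llm(build_fusion_prompt(need, ip, rag))
```

Passing the generators, evaluator, and final model as callables keeps the sketch model-agnostic, in line with the claim above; any concrete model can be wrapped in a function and plugged in.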

Abstract

In response to the urgent need for effective communication with crisis-affected populations, automated responses driven by language models have been proposed to assist in crisis communications. A critical yet often overlooked factor is the consistency of response style, which could affect the trust of affected individuals in responders. Despite its importance, few studies have explored methods for maintaining stylistic consistency across generated responses. To address this gap, we propose a novel metric for evaluating style consistency and introduce a fusion-based generation approach grounded in this metric. Our method employs a two-stage process: it first assesses the style of candidate responses and then optimizes and integrates them at the instance level through a fusion process. This enables the generation of high-quality responses while significantly reducing stylistic variation between instances. Experimental results across multiple datasets demonstrate that our approach consistently outperforms baselines in both response quality and stylistic uniformity.
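To make "stylistic variation between instances" concrete, one simple proxy is the spread of each style score across a set of responses. The sketch below uses the population standard deviation per dimension. This is an assumed stand-in for illustration only: the paper's actual consistency metric is not reproduced in this text, and the function name `style_spread` is hypothetical.

```python
from statistics import pstdev
from typing import Dict, List


def style_spread(scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Per-dimension population standard deviation across responses.

    Lower values mean more uniform style. This is an illustrative proxy,
    not the consistency metric defined in the paper.
    """
    dims = list(scores[0].keys())
    return {d: pstdev([r[d] for r in scores]) for d in dims}
```

Under this proxy, a set of responses that all receive the same score on a dimension has a spread of zero on that dimension.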

Paper Structure

This paper contains 27 sections, 3 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: Examples of responses with high and low professionalism and actionability. Professional responses include explanations backing recommendations, demonstrating authority. Actionable responses offer specific guidance (e.g., phone numbers, website links) that users can follow to seek help. In this paper, we focus on generating consistent responses, i.e., ensuring that professionalism, actionability, and relevance are roughly the same across all responses.
  • Figure 2: Overview of our fusion framework. Initial responses vary in professionalism (red), actionability (purple), and relevance (green); darker indicates higher. The fusion mechanism results in consistent responses that address individual needs and combine the strengths of the initial responses: all users receive responses with high professionalism, actionability, and relevance.
  • Figure 3: Results after one or more iterations of fusion with Eval & Weight Instruct and Llama-3.1-8B-Instruct. Consistency scores are visualized in a mini line chart. Average professionalism, actionability, and relevance remain high from the first iteration onward, whereas consistency plateaus after three iterations.
  • Figure 4: Multiple rounds of fusion with Eval & Weight Instruct in the generalization experiments. The results show that the method's performance remains stable regardless of the number of fusion rounds.