Missing-Aware Multimodal Fusion for Unified Microservice Incident Management

Wenzhuo Qian, Hailiang Zhao, Ziqi Wang, Zhipeng Gao, Jiayi Chen, Zhiwei Ling, Shuiguang Deng

Abstract

Automated incident management is critical for microservice reliability. While recent unified frameworks leverage multimodal data for joint optimization, they unrealistically assume perfect data completeness. In practice, network fluctuations and agent failures frequently cause missing modalities. Existing approaches relying on static placeholders introduce imputation noise that masks anomalies and degrades performance. To address this, we propose ARMOR, a robust self-supervised framework designed for missing modality scenarios. ARMOR features: (i) a modality-specific asymmetric encoder that isolates distribution disparities among metrics, logs, and traces; and (ii) a missing-aware gated fusion mechanism utilizing learnable placeholders and dynamic bias compensation to prevent cross-modal interference from incomplete inputs. By employing self-supervised auto-regression with mask-guided reconstruction, ARMOR jointly optimizes anomaly detection (AD), failure triage (FT), and root cause localization (RCL). AD and RCL require no fault labels, while FT relies solely on failure-type annotations for the downstream classifier. Extensive experiments demonstrate that ARMOR achieves state-of-the-art performance under complete data conditions and maintains robust diagnostic accuracy even with severe modality loss.
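The missing-aware gated fusion idea in (ii) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `missing_aware_fusion`, the softmax gate, and every parameter shape are assumptions made for illustration, and the placeholder and bias values that would be learned during training are fixed by hand here. It shows only the general pattern: substitute a per-modality placeholder for an absent embedding, then shift the gate logit of that modality with a compensation bias so that it does not dominate the fused representation.

```python
import numpy as np

def missing_aware_fusion(embeddings, present, placeholders, gate_w, missing_bias):
    """Fuse per-modality embeddings with a presence-aware softmax gate.

    embeddings   : (M, D) array, rows of missing modalities may hold NaN
    present      : (M,) boolean array, True where the modality was observed
    placeholders : (M, D) array, stand-ins for missing modalities
                   (learnable in a real model, fixed in this sketch)
    gate_w       : (D,) array, hypothetical gate scoring vector
    missing_bias : (M,) array, logit offset applied only to missing modalities
    """
    # Replace absent rows with their placeholder; np.where ignores the NaNs.
    filled = np.where(present[:, None], embeddings, placeholders)
    # Score each modality, then compensate the logits of the missing ones.
    logits = filled @ gate_w + np.where(present, 0.0, missing_bias)
    # Numerically stable softmax over modalities.
    logits = logits - logits.max()
    weights = np.exp(logits) / np.exp(logits).sum()
    # Weighted sum of modality embeddings gives the fused representation.
    return weights @ filled, weights

rng = np.random.default_rng(0)
M, D = 3, 4                               # metrics, logs, traces; toy width
embeddings = rng.normal(size=(M, D))
embeddings[1] = np.nan                    # pretend the log modality is missing
present = np.array([True, False, True])
placeholders = np.zeros((M, D))           # assumption: zero-initialised
gate_w = rng.normal(size=D)
missing_bias = np.full(M, -5.0)           # assumption: down-weight missing inputs

fused, weights = missing_aware_fusion(
    embeddings, present, placeholders, gate_w, missing_bias
)
```

Because the missing row is replaced before any arithmetic, whatever values sit in that row (NaN, zeros, stale data) have no effect on the fused vector; only the placeholder and the bias do.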

Paper Structure

This paper contains 32 sections, 11 equations, 8 figures, and 3 tables.

Figures (8)

  • Figure 1: Overview of the multimodal incident management pipeline. Diverse observability data (metrics, logs, and traces) drives three sequential tasks: AD, FT, and RCL.
  • Figure 2: The generation of missing modalities in production environments. This timeline illustrates how vulnerabilities in the observability infrastructure (e.g., agent crashes or aggressive sampling) cause abrupt data loss, even while the underlying microservice operates normally and outputs logs.
  • Figure 3: Performance degradation of ART [sun2024art], the strongest self-supervised unified baseline, under missing modalities. Metric loss causes a steep drop in RCL accuracy, confirming that static imputation cannot compensate for absent continuous streams.
  • Figure 4: The impact of static imputation on latent feature distributions. A t-SNE visualization demonstrates how non-adaptive default values force the features of anomalous instances into the normal cluster, creating spurious correlations that suppress actual diagnostic signals.
  • Figure 5: Overview of the proposed missing-aware incident management framework. It comprises three core modules: (1) Modality-Specific Status Learning, extracting intra-modality features via an asymmetric encoder; (2) Missing-Aware Global Fusion, integrating isolated embeddings through an attention-guided gating mechanism and a topology-aware graph network; and (3) Diagnostic Tasks, optimizing the network via offline self-supervised reconstruction and executing online AD, FT, and RCL using the unified representations.
  • ...and 3 more figures