OmniGuard: Unified Omni-Modal Guardrails with Deliberate Reasoning
Boyu Zhu, Xiaofei Wen, Wenjie Jacky Mo, Tinghui Zhu, Yanan Xie, Peng Qi, Muhao Chen
TL;DR
OmniGuard introduces the first unified omni-modal guardrails with deliberate reasoning to moderate safety across text, images, video, and audio. It builds a large omni-modal safety dataset and uses targeted distillation from expert models, followed by mission-focused instruction tuning, to train OmniGuard-7B and OmniGuard-3B. The approach achieves state-of-the-art or competitive performance across 15 safety benchmarks, with strong cross-modal generalization and interpretable reasoning for safety judgments. The work advances robust, explainable safety moderation for multimodal AI systems and highlights areas for efficiency improvements and richer cross-modal data. This framework lays the groundwork for scalable, policy-grounded, cross-modal safeguarding in next-generation omni-modal models.
Abstract
Omni-modal Large Language Models (OLLMs) that process text, images, videos, and audio introduce new challenges for safety and value guardrails in human-AI interaction. Prior guardrail research largely targets unimodal settings and typically frames safeguarding as binary classification, which limits robustness across diverse modalities and tasks. To address this gap, we propose OmniGuard, the first family of omni-modal guardrails that performs safeguarding across all modalities with deliberate reasoning ability. To support the training of OmniGuard, we curate a large, comprehensive omni-modal safety dataset comprising over 210K diverse samples, with inputs that cover all modalities through both unimodal and cross-modal samples. Each sample is annotated with structured safety labels and carefully curated safety critiques from expert models through targeted distillation. Extensive experiments on 15 benchmarks show that OmniGuard achieves strong effectiveness and generalization across a wide range of multimodal safety scenarios. Importantly, OmniGuard provides a unified framework that enforces policies and mitigates risks across all modalities, paving the way toward more robust and capable omni-modal safeguarding systems.
