Audio Hallucination Attacks: Probing the Reliability of Large Audio Language Models

Ashish Seth, Sonal Kumar, Ramaneswaran Selvakumar, Nishit Anand, Utkarsh Tyagi, Prem Seetharaman, Ramani Duraiswami, Dinesh Manocha

Abstract

Large Audio Language Models (LALMs) achieve strong performance on audio-language tasks; however, their reliability in real-world settings remains underexplored. We introduce Audio Hallucination Attacks (AHA) and its attack suite, AHA-Eval, comprising 6.5K QA pairs designed to test whether LALMs genuinely ground their responses in the audio input. AHA targets two attack surfaces: (i) query-based attacks, which exploit question structure to induce hallucinations about absent sounds, and (ii) audio-based attacks, which inject synthetic speech describing non-existent events into the audio stream. Evaluating state-of-the-art LALMs, including Audio Flamingo 3 and Gemini 3 Pro, we observe high attack success rates of 95.35% and 79.65%, respectively, revealing a reliability gap that is hidden by standard benchmark performance. To mitigate this, we propose AHA-Guard, a 120K-QA preference-alignment dataset, which reduces attack success rates by up to 49%.

Paper Structure

This paper contains 12 sections, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: Explicit vs. Implicit Queries. ① Given an audio clip of ocean waves with no seagulls present, Gemini 3 Pro [team2023gemini] correctly rejects an explicit query about the sound's existence, ② yet when posed an implicit query that presupposes the sound, the model bypasses the crucial grounding step and produces a confident but hallucinated response. (An illustrative sketch of the two query formats follows this figure list.)
  • Figure 2: Overview of the AHA Data Curation and Attack Generation Pipeline. ① Data Filtering: We filter audio clips from AudioCaps [audiocaps], Clotho [drossos2020clotho], and MusicCaps [agostinelli2023musiclm] using LLM-based caption consistency checks, resulting in 8K verified audio-caption pairs. ② Hallucinated Sound Generation: For each clip, we use Gemini 3 Pro [team2023gemini] to generate counterfactual sound events, including two adversarial (contextually plausible) and two random (out-of-context) sounds. ③ QA Construction: We leverage each hallucinated sound to generate both explicit attacks (directly querying the presence of the sound) and implicit attacks (presupposing its existence). ④ Audio Manipulation: Beyond prompt-based attacks, we prepend TTS-synthesized utterances referencing the hallucinated sounds to the original audio, creating acoustically grounded false cues (see the audio-manipulation sketch after this list). ⑤ AHA Outputs: Our pipeline produces AHA-Eval (6.5K attack pairs) and AHA-Guard (120K DPO preference pairs). (All prompts used can be found in the supplementary material.)
  • Figure 3: Qualitative Example. An example of a multi-turn conversation with Gemini 3 Pro in which the model, when exposed to a manipulated audio track, hallucinates and describes non-existent sounds. The error propagates further when the model is asked specific questions about the hallucinated sound.
  • Figure 4: Investigating the cause of hallucinations. (Top) Attention to audio during token generation. (Bottom) Confidence of hallucinated "Yes" responses.
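
Illustrative Sketches

To make the two attack formats in Figure 1 (and step ③ of Figure 2) concrete, the minimal sketch below templates an explicit and an implicit attack query from a hallucinated sound event. The templates, function names, and wording are illustrative assumptions only; the paper's actual prompts are generated with Gemini 3 Pro and provided in the supplementary material.

```python
# Illustrative only: the real AHA queries are LLM-generated (see supplementary material).
# These hand-written templates merely mirror the two attack formats.

def make_explicit_query(sound: str) -> str:
    """Explicit attack: directly ask whether the (absent) sound is present."""
    return f"Is there a {sound} audible in this clip?"

def make_implicit_query(sound: str) -> str:
    """Implicit attack: presuppose the (absent) sound and ask for its details."""
    return f"How loud is the {sound} in this clip, and when does it occur?"

if __name__ == "__main__":
    hallucinated_sound = "seagull call"  # counterfactual event for an ocean-waves clip
    print(make_explicit_query(hallucinated_sound))
    print(make_implicit_query(hallucinated_sound))
```

A grounded model should reject both queries; answering the implicit query as if the sound were present counts as a successful attack.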
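Step ④ of the pipeline in Figure 2 prepends TTS-synthesized false cues to the original recording. The sketch below shows one way such a manipulated clip could be assembled, assuming mono WAV files at a shared sample rate; the helper name, file paths, and the use of numpy/soundfile are assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch of the audio-manipulation attack (Figure 2, step 4), assuming the
# TTS false cue has already been synthesized to a mono WAV at the same sample rate
# as the original clip. Paths and helper names are hypothetical.
import numpy as np
import soundfile as sf

def prepend_false_cue(cue_path: str, audio_path: str, out_path: str) -> None:
    cue, sr_cue = sf.read(cue_path)    # TTS utterance describing a non-existent sound
    audio, sr = sf.read(audio_path)    # original, clean recording
    assert sr_cue == sr, "resample one of the clips first if the sample rates differ"
    attacked = np.concatenate([cue, audio])  # false cue precedes the real audio
    sf.write(out_path, attacked, sr)

# Hypothetical usage:
# prepend_false_cue("seagull_cue_tts.wav", "ocean_waves.wav", "ocean_waves_attacked.wav")
```

Because the injected cue is ordinary speech describing a sound rather than the sound itself, a model that genuinely grounds its answers in the acoustic content should still report that the sound is absent.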