Inaudible Audio Hijack Attacks Target AI Voice Models
Researchers at Zhejiang University developed AudioHijack, a method for embedding hidden commands in audio that deceive AI voice models with up to 96% success. The attack sidesteps text-based defenses by altering the audio signal in ways humans cannot hear, and it worked against open-source models as well as commercial systems from Microsoft and Mistral.
Quick Take
AudioHijack embeds hidden commands in audio to manipulate AI voice models.
Attack achieves 79–96% success rate across 13 open-source models.
Commercial systems from Microsoft and Mistral were also vulnerable.
Defenses are limited; attention monitoring is partially effective.
Market Impact Analysis
Neutral: The news is about AI security, with no direct crypto market implications.
Key Takeaways
- AudioHijack embeds hidden commands in audio to manipulate AI voice models with up to 96% success rate.
- The attack bypasses text-based safeguards by altering the audio signal in ways humans cannot hear.
- 13 open-source models and commercial systems from Microsoft and Mistral were all compromised.
- Existing defenses are largely ineffective, and the adversarial signal takes only 30 minutes to train.
What Happened
Researchers at Zhejiang University unveiled AudioHijack, a novel attack that injects imperceptible commands into audio clips to seize control of AI voice models. Presented at the 47th IEEE Symposium on Security and Privacy in San Francisco, the method manipulates digital audio waveforms in ways humans cannot hear but AI models interpret as instructions. The attack does not alter the user’s spoken input; instead, it fuses hidden signals into the audio itself, allowing an adversary to redirect or override model behavior even during someone else’s session. This means a corrupted video, voice note, or Zoom recording could silently trigger malicious actions like spreading misinformation, stealing data, or performing unauthorized transactions.
The Numbers
Tests across 13 open-source large audio-language models showed a 79–96% success rate in hijacking model outputs. The adversarial signal requires only 30 minutes to train and is context-agnostic, meaning it works regardless of the legitimate user prompt. Commercial voice AI systems from Microsoft and Mistral were also vulnerable, highlighting that the risk extends beyond open-source ecosystems. Common defenses, such as filtering or adversarial training, stopped only a fraction of attempts, so most attacks still got through.
Why It Happened
AudioHijack exploits a fundamental gap between human and machine perception. AI models process audio as numerical matrices, and slight perturbations that are inaudible to the human ear can drastically alter the model’s interpretation. Because the attack modifies the acoustic signal rather than text prompts, it completely sidesteps safety filters designed to catch malicious natural language. The technique’s efficiency stems from its universality: a single trained perturbation can be applied to any audio input, making it deployable at scale.
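To make the idea of a single, input-independent perturbation concrete, here is a minimal toy sketch in Python. It is not the researchers' method and does not attack any real voice model: the linear "model" (`WEIGHTS`, `score`), the decision `THRESHOLD`, the sample rate, and the bound `EPSILON` are all hypothetical values chosen for illustration. It only shows how a change that is tiny on every individual sample can add up across thousands of samples and shift a model's output, whatever the underlying clip contains.

```python
import math

SAMPLE_RATE = 16_000   # assumed sample rate of the toy clips
N_SAMPLES = 16_000     # one second of audio
EPSILON = 0.002        # assumed per-sample perturbation bound (full scale is 1.0)
THRESHOLD = 10.0       # hypothetical decision threshold of the toy model


def tone(freq_hz, amplitude=0.5):
    """Sine tone standing in for a benign audio clip."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(N_SAMPLES)]


# Hypothetical linear "model": it correlates the input with a fixed 7 kHz pattern.
WEIGHTS = tone(7_000, amplitude=1.0)


def score(waveform):
    """Toy model output; a score above THRESHOLD stands for 'obey the hidden command'."""
    return sum(w * x for w, x in zip(WEIGHTS, waveform))


def universal_perturbation(epsilon=EPSILON):
    """One input-independent signal, at most epsilon per sample, aligned with the weights."""
    return [math.copysign(epsilon, w) for w in WEIGHTS]


def perturb(waveform, delta):
    """Add the perturbation and clip the result to the valid [-1, 1] range."""
    return [max(-1.0, min(1.0, x + d)) for x, d in zip(waveform, delta)]


def snr_db(clean, perturbed):
    """Ratio of clean-signal energy to perturbation energy, in decibels."""
    signal = sum(x * x for x in clean)
    noise = sum((p - x) ** 2 for x, p in zip(clean, perturbed))
    return 10 * math.log10(signal / noise)


DELTA = universal_perturbation()  # computed once, reused for every clip
for freq in (440, 1_000):
    clean = tone(freq)
    hijacked = perturb(clean, DELTA)
    print(freq, score(clean) > THRESHOLD, score(hijacked) > THRESHOLD,
          round(snr_db(clean, hijacked), 1))
```

The same `DELTA` is added to both clips, which mirrors the "context-agnostic" property described above. A real attack on a large audio-language model would need an optimization procedure against a nonlinear network, and a high signal-to-noise ratio in this toy does not by itself establish that a perturbation is inaudible.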
Broader Impact
The research team is now probing whether the attack can leap to closed models from OpenAI and Anthropic via shared open-source audio processing components. If successful, the vulnerability could become an industry-wide threat, forcing a rethink of how voice AI models are audited and secured. This raises urgent questions about the safety of AI assistants that increasingly operate with access to personal data, devices, and financial accounts.
What to Watch Next
- Cross-model transferability: Results from ongoing tests against proprietary models will define the real-world risk surface.
- Defensive innovation: Expect a wave of research into audio-specific adversarial detectors and input sanitization techniques.
- Regulatory response: As AI agents gain autonomy, incidents like these could accelerate calls for mandatory security audits.
This article is for informational purposes only and does not constitute financial advice.
Disclaimer: Bytewit is an independent media outlet that delivers news, research, and data.
© 2026 Bytewit. All Rights Reserved.