December 18, 2025
Grok Voice Agent API
Bringing the power of Grok Voice to all developers.

TL;DR
- Grok Voice Agent API launched for developers.
- Built on the same stack as Grok Voice in mobile apps and Tesla vehicles.
- Features multilingual capabilities (dozens of languages) with native-level proficiency and seamless language switching.
- Ranks #1 on the Big Bench Audio benchmark for audio reasoning.
- Offers an average time-to-first-audio of less than 1 second, significantly faster than competitors.
- Industry-leading cost-efficiency at $0.05 per minute.
- Consistently preferred over the OpenAI Realtime API in human evaluations for pronunciation, accent, and prosody.
- Developed with Tesla as a critical design partner, powering Grok in millions of vehicles.
- Enables integration of custom tools and leverages xAI's real-time search across X and the web.
- Offers multiple expressive voices (Ara, Eve, Leo) and auditory cues for enhanced realism.
- Compatible with the OpenAI Realtime API specification and available via the xAI LiveKit Plugin.
- Upcoming releases include standalone text-to-speech and speech-to-text endpoints, and improved audio models.
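
To put the $0.05 per minute figure in concrete terms, here is a minimal sketch of a cost estimate. Only the per-minute price comes from the announcement; the function name `estimate_cost_usd` is hypothetical, and the linear per-second proration is an assumption, since the summary does not say how partial minutes are billed.

```python
from decimal import Decimal

# Price taken from the announcement: $0.05 per minute of voice agent usage.
PRICE_PER_MINUTE_USD = Decimal("0.05")


def estimate_cost_usd(total_seconds: int) -> Decimal:
    """Estimate cost in USD for a given amount of audio time.

    Assumption: usage is prorated linearly by the second. The actual
    billing granularity is not stated in the source.
    """
    minutes = Decimal(total_seconds) / Decimal(60)
    # Round to whole cents for display.
    return (minutes * PRICE_PER_MINUTE_USD).quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Example workload: 1,000 calls averaging 3 minutes (180 seconds) each.
    print(estimate_cost_usd(1000 * 180))
```

Under these assumptions, 3,000 minutes of conversation comes to $150.00. `Decimal` is used instead of floats so that the cent rounding stays exact.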