May 8, 2026
OpenAI's New Models Listen, Translate & Act in Real Time
OpenAI is introducing three audio models for its developer platform that let developers build conversational voice agents that listen, reason, and act in real time.

TL;DR
- OpenAI introduces three new audio models for its developer platform: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper.
- These models enable the creation of conversational voice agents that can listen, reason, and act in real time.
- GPT-Realtime-2 offers GPT-5-class reasoning for natural conversation flow and complex requests.
- GPT-Realtime-Translate provides live translation from over 70 languages to 13 output languages, useful for customer support and education.
- GPT-Realtime-Whisper is a streaming speech-to-text model for live transcription, suited to use cases such as captions and meeting notes.
- Priceline and Deutsche Telekom are among the companies exploring these new voice AI capabilities.
- Safeguards against misuse, including harmful content classifiers and usage policies, are integrated into the Realtime API.
- Developers must ensure end users are aware when they are interacting with AI.