February 5, 2026

Gemini Audio

Engage in fluid, natural conversation with a model that listens, reasons, and responds in real time.

TL;DR

  • Live audio agents for fluid, real-time conversation.
  • Expressive audio generation with control over style, tone, and performance.
  • Live speech translation in over 70 languages, preserving speaker characteristics.
  • Audio understanding for summarizing events and extracting data from audio files.
  • Accurate speaker separation and sentiment analysis in audio.
  • Real-time actions through tool use and function calling.
  • Conversation context awareness and robust steerability for consistent personas.
  • Multi-speaker generation for creating dialogues.
  • Automatic language detection and noise robustness for translation.
  • Safety measures and SynthID watermarking for audio outputs.
