April 2, 2026
Microsoft takes on AI rivals with three new foundational models
Microsoft AI (MAI) released models that transcribe voice into text and generate audio and images, six months after the group's formation.

TL;DR
- Microsoft AI released three new foundational AI models: MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2.
- MAI-Transcribe-1 transcribes speech in 25 languages and is 2.5 times faster than Azure Fast.
- MAI-Voice-1 generates 60 seconds of audio in one second and allows custom voice creation.
- MAI-Image-2 is an image-generating model.
- These models are available on Microsoft Foundry and MAI Playground and are intended to be cheaper than Google and OpenAI offerings.
- The models were developed by Microsoft's MAI Superintelligence team, led by Mustafa Suleyman.
- Microsoft reaffirms its commitment to its partnership with OpenAI, despite developing its own AI models.