The company says it remains closely connected to Kyutai, giving its team a “fast, direct path” for turning foundational work in generative audio into deployable products. “After 10 years of research in frontier labs, I’m now focused as CEO on bringing our models to production,” Zeghidour wrote in a LinkedIn post.
Inside Gradium’s Voice AI Offering
Gradium is releasing production-ready speech-to-text (STT) and text-to-speech (TTS) models supporting English, French, Spanish, Portuguese, and German — with more languages currently in development.
Key capabilities include:
- Live AI audio transcription with semantic voice activity detection for smart turn-taking, noise robustness, and code-switching support
- High-quality, low-latency speech synthesis
- Instant voice cloning from a 10-second audio sample, with up to 1,000 clones depending on plan
- A library of male and female voices across multiple locales
Sessions can run up to 300 seconds, while longer content can be handled by splitting into multiple sessions.
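As a rough illustration of that splitting step, the sketch below plans consecutive windows of at most 300 seconds over a longer recording. It is not Gradium's API: the function name `plan_sessions` and its parameters are hypothetical, and only the 300-second limit comes from the company's description.

```python
from typing import List, Tuple

# Per-session limit quoted in the article; everything else here is illustrative.
MAX_SESSION_SECONDS = 300.0


def plan_sessions(
    total_seconds: float, max_seconds: float = MAX_SESSION_SECONDS
) -> List[Tuple[float, float]]:
    """Return (start, end) offsets covering the content in windows of at most max_seconds."""
    if total_seconds < 0 or max_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and max_seconds must be > 0")

    windows: List[Tuple[float, float]] = []
    start = 0.0
    while start < total_seconds:
        # Cap each window at the session limit; the last window may be shorter.
        end = min(start + max_seconds, total_seconds)
        windows.append((start, end))
        start = end
    return windows


if __name__ == "__main__":
    # A 12.5-minute recording (750 s) needs three sessions.
    for start, end in plan_sessions(750):
        print(f"session covers {start:.0f}s to {end:.0f}s")
```

A real integration would probably cut at pauses rather than at fixed offsets, so that words are not split across sessions.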
The platform supports everything from rapid prototyping through its API to full enterprise deployments for high-volume production workloads.
“Our models and platform are designed to deliver ultra-realistic, emotionally expressive speech with low latency, while remaining efficient and scalable so high-quality voice can be broadly affordable and widely available,” the company claims.
Gradium says its real-time transcription and synthesis APIs are already used in production by studios and game developers for immersive characters, by language platforms for instant translation and natural-sounding voices, and by healthcare innovators exploring low-latency conversational assistants with privacy guarantees. Additional use cases include customer care, market research, e-learning, and digital advertising.
The Competitive Voice AI Landscape
Gradium enters a rapidly evolving voice AI market that includes OpenAI, ElevenLabs, Deepgram, Mistral, and Google, among others.
Its offering — real-time STT, low-latency TTS, and voice cloning — aligns with what buyers now expect from a voice AI stack.
However, beyond company materials and demos, no independent benchmarks yet compare Gradium with established providers. Whether it becomes a significant voice AI player will depend on its ability to match competitors' performance in real-world deployments, and to prove it.
More broadly, Gradium’s launch is further evidence that voice remains a major front in AI, as Voiseed CEO and Co-founder Andrea Ballista noted at SlatorCon Remote December 2025.
A Signal for Europe’s Voice AI Ecosystem
Some also see Gradium’s launch as part of a broader European push toward sovereign AI infrastructure.
Daniel Jarjoura, Founder of the Unicorn CTO community, noted in a LinkedIn post that Gradium’s launch and funding underscore the maturity of the European voice AI stack, which now covers every layer, from audio preprocessing to STT, TTS, and orchestration, using European or open-source tools.
According to Jarjoura, Europe has “every building block to deploy sovereign, production-grade Voice AI” with replaceable vendors and no hyperscaler lock-in.
“Gradium’s round is not an exception,” he wrote, but “a signal that the European Voice AI stack is mature, and ready for real deployments.”