Speech & Voice

Latest speech & voice

Innovations in AI speech and AI voice, including live translation, voice cloning, speech-to-speech models, and speech-to-text

Voice AI companies that consistently host and actively participate in professional events gain early visibility...

Researchers from three universities collected a wealth of natural human interactions across four languages and...

OpenAI updates its gpt-realtime voice model to improve reliability in multilingual voice agents for customer...

Recent releases from Microsoft and AssemblyAI reflect growing interest in structured, configurable speech recognition as...

Researchers argue that text-to-speech evaluation under-tests what matters in real-world deployment — and propose a...

Microsoft releases VibeVoice-ASR, an open-source speech-to-text model designed for long-form audio, structured transcription, and customised...

Alibaba’s Qwen team releases a new set of open-source speech models under the Qwen3 family,...

A new cross-lingual voice cloning track at IWSLT 2026 highlights changing priorities in multilingual speech...

NVIDIA doubles down on open speech AI with ultra-low-latency automatic speech recognition and multilingual text-to-speech...

Google introduces MedASR, an open-weight medical speech-to-text model positioned as a foundational layer for healthcare...

Resemble AI releases an open-source text-to-speech model designed for real-time, expressive voice generation and positioned...

A comprehensive new evaluation finds that cascaded speech translation systems still deliver more consistent results...
