Speech & voice

Alibaba’s Marco-Voice combines voice cloning with controllable emotion, delivering more natural and expressive synthetic speech...

Stanford and UC Santa Cruz launch a benchmark for audio-language models, with Google’s Gemini 2.5...

Microsoft’s VibeVoice is an open-source text-to-speech model that generates podcast-length audio with up to four...

Free, open-source transcription app Whispering lets users keep recordings on-device or connect directly to providers...

The company now enables users to have both consecutive and real-time conversations in over 70...

At IWSLT 2025, researchers and industry share ways to boost speed, quality, subtitles, model size,...

ByteDance introduces a product-ready AI live speech translation system that delivers near-human accuracy, real-time voice...

Mistral releases Voxtral, a new family of open-source models for AI speech translation and transcription,...

MiniMax is one of China’s “Six Tigers” — six AI firms at the top of...

Microsoft introduces a voice conversion feature in Azure AI Speech, allowing users to transform recorded...

A new Google study finds that multilingual speech datasets suffer from serious data quality issues...

Shortly after encouraging its use among healthcare practitioners, NHS England has asked them...

OpenAI’s Altman called a trademark lawsuit by iyO “silly,” citing iyO’s past “babelfish” app collab...

The deal, negotiated by the Screen Actors Guild-American Federation of Television and Radio Artists, includes...

Researchers at ZHAW fine-tuned a multilingual text-to-speech model on nearly 5,000 hours of podcast audio...