Voice Cloning Meets Emotional Speech Synthesis With Alibaba’s Marco-Voice Model
Alibaba’s Marco-Voice combines voice cloning with controllable emotion, delivering more natural and expressive synthetic speech...
Dubbing and subtitling news from streaming, gaming, etc.
Stanford and UC Santa Cruz launch a benchmark for audio-language models, with Google’s Gemini 2.5...
Microsoft’s VibeVoice is an open-source text-to-speech model that generates podcast-length audio with up to four...
Free, open-source transcription app Whispering lets users keep recordings on-device or connect directly to providers...
The company now enables users to have both consecutive and real-time conversations in over 70...
At IWSLT 2025, researchers and industry share ways to boost speed, quality, subtitles, model size,...
ByteDance introduces a product-ready AI live speech translation system that delivers near-human accuracy, real-time voice...
Mistral releases Voxtral, a new family of open-source models for AI speech translation and transcription,...
MiniMax is one of China’s “Six Tigers” — six AI firms at the top of...
Microsoft introduces a voice conversion feature in Azure AI Speech, allowing users to transform recorded...
A new Google study finds that multilingual speech datasets suffer from serious data quality issues...
A short time after encouraging its use among healthcare practitioners, NHS England has asked them...
OpenAI's Altman called a trademark lawsuit by iyO “silly,” citing iyO's past “babelfish” app collab...
The deal, negotiated by the Screen Actors Guild-American Federation of Television and Radio Artists, includes...