Voice Cloning Meets Emotional Speech Synthesis With Alibaba’s Marco-Voice Model
Alibaba’s Marco-Voice combines voice cloning with controllable emotion, delivering more natural and expressive synthetic speech...
Microsoft’s VibeVoice is an open-source text-to-speech model that generates podcast-length audio with up to four...
MiniMax is one of China’s “Six Tigers” — six AI firms at the top of...
Google’s Gemini 2.5 delivers stronger performance in CJK and Indic languages, better output language control,...
Researchers at ZHAW fine-tuned a multilingual text-to-speech model on nearly 5,000 hours of podcast audio...
IIT Bombay researchers propose a new approach to speech-to-speech translation that not only translates speech...
The seed round funding boosts Panjaya’s product development, AI research, and user experience, as it...
Apple researchers find that speech translation systems struggle with prosody, affecting accuracy, and recommend improvements...
The start-up launches its text-to-speech tool and claims ultra-low latency within months of operating, as...
The chip manufacturer — the third most valuable company in the world — now offers...
Researchers from Sony and IIIT present DubWise, a method that combines large language models with...