Speech & Voice

Latest speech & voice

Innovations in AI speech and AI voice, including live translation, voice cloning, speech-to-speech models, and speech-to-text.

How Voice AI Companies Use Live Events to Build Trust and Shape Buyer Thinking

Voice AI companies that consistently host and actively participate in professional events gain early visibility...

Breakfasts, Game Nights, Car Rides: The PECII Corpus for Natural Language Interactions

Researchers from three universities collected a wealth of natural human interactions across four languages and...

OpenAI Updates Realtime Model to Improve Multilingual Voice Agent Reliability

OpenAI updates its gpt-realtime voice model to improve reliability in multilingual voice agents for customer...

Prompt-Based Control Reaches Enterprise Speech-to-Text

Recent releases from Microsoft and AssemblyAI reflect growing interest in structured, configurable speech recognition as...

What Text-to-Speech Evaluation Misses in Real-World Deployment

Researchers argue that text-to-speech evaluation under-tests what matters in real-world deployment — and propose a...

Microsoft Unveils VibeVoice-ASR for Long-Form, Multi-Speaker Transcription

Microsoft releases VibeVoice-ASR, an open-source speech-to-text model designed for long-form audio, structured transcription, and customised...

Alibaba Expands Speech Stack with New Open-Source Models

Alibaba’s Qwen team releases a new set of open-source speech models under the Qwen3 family,...

Researchers Begin Benchmarking Cross-Lingual Voice Cloning

A new cross-lingual voice cloning track at IWSLT 2026 highlights changing priorities in multilingual speech...

NVIDIA Doubles Down on Open Speech AI

NVIDIA doubles down on open speech AI with ultra-low-latency automatic speech recognition and multilingual text-to-speech...

Google Launches MedASR, an Open Medical Speech-to-Text Model

Google introduces MedASR, an open-weight medical speech-to-text model positioned as a foundational layer for healthcare...

Resemble AI Open-Sources Chatterbox Turbo

Resemble AI releases an open-source text-to-speech model designed for real-time, expressive voice generation and positioned...

Cascades Still Outperform SpeechLLMs in Translation, Research Finds

A comprehensive new evaluation finds that cascaded speech translation systems still deliver more consistent results...
