AI Voice Clones More Intelligible Than Humans in Noise, Study Finds

The study compared ten human voices with AI-generated clones of the same speakers, using a commercial voice cloning system (ElevenLabs). The experiments were conducted in British English. In listening tests, participants consistently recognized more words correctly when hearing the cloned versions, and the advantage held across all tested noise conditions.

The findings suggest that recent advances in voice AI are not only improving how natural voices sound but may also improve speech clarity. The researchers attribute this to more stable pitch and cleaner sound patterns in cloned voices, which make them easier to distinguish from background noise.

Although the cloned voices were rated as slightly clearer than the human originals, listeners could still tell the two apart in around 70% of cases, meaning clearer speech does not necessarily sound fully human.

At the same time, cloned voices were perceived as more regionally accented than the human originals, suggesting that improved clarity does not necessarily result in more neutral or standard speech.

Accessibility and Other Applications

The researchers point to a range of accessibility-related applications for voice cloning, including voice restoration for individuals who have lost their ability to speak and speech synthesis for non-verbal users.

They also suggest that clearer cloned speech could benefit people with hearing loss and could be integrated into assistive communication technologies, including hearing aids and cochlear implants, to improve speech perception in noise.

The researchers also highlight potential use in public announcement systems, where speech clarity is essential for conveying information effectively.

“Our findings provide an empirical foundation for deploying synthetic voices in assistive and communicative applications (e.g., hearing aids, emergency announcement systems), while underscoring the need for future work to maintain expressive nuance, naturalness, and similarity during voice cloning,” they said.

The findings may also be relevant for industries such as media localization and dubbing, where speech intelligibility is a key factor.

The researchers note that the study was conducted under controlled conditions using standardized sentences and one voice cloning system. Results may vary across real-world settings, languages, and technologies.

They also acknowledge broader risks associated with voice cloning, including potential misuse for fraud, impersonation, and misinformation, as well as privacy concerns from unauthorized voice cloning that could undermine trust in voice-based systems.
