Meta Doubles Down on Direct Speech-to-Speech Translation

The social media giant described the approach as the first S2ST framework “trained on real-world open sourced audio data.” It is now being tested using the University of Pennsylvania’s Fisher Spanish-English speech translation corpus, an audio database of 139,000 sentences from phone conversations in Spanish.

Scientists involved in this and similar projects at Meta claim that, until now, S2ST systems had not been successfully trained with “publicly available real-world data on multiple languages.”

The implications of this advancement are many. They include language-neutral connectivity across live platforms for business or leisure, and a transformation of the interpreting landscape much sooner than many anticipate.

Meta researchers expect their speech-to-speech translation research to improve translation quality, speed up language conversion, and make communication easier for users.

In a sort of surreptitious crowdsourcing of applications, Meta has made all related papers and code available free of charge via the blog post, stating its “hope to enable future direct speech-to-speech translation advancements across the research community.”

Whether in the hands of the lone developer, techie entrepreneur, or academic researcher, a scientific breakthrough of this nature has the potential to shorten the path to multilingual implementations within the “Metaverse” and beyond.
