During Google’s annual developer conference, Google I/O 2025, held on May 20-21, 2025, the company announced real-time speech-to-speech translation (S2ST) between English and Spanish within Google Meet.
At the time, the company did not reveal any technical details. The demo video shown to users in May is the same one embedded in the new post — a strong hint that the underlying system has been in circulation for months. The feature initially rolled out in beta to users on the Google AI Pro plan.
In September, Google announced four new languages for Google Meet’s AI live speech translation — Italian, German, French, and Portuguese — and shared that the system had moved “from AI research to reality.”
Over the summer, Google also announced that its new Pixel 10 phones can translate telephone conversations in real time using each speaker’s voice and intonation. Once again, the company did not reveal any technical details.
Google Connects the Dots
With this week’s research post, Google is connecting these dots for the first time. The company confirms that Google Meet’s and Pixel 10’s S2ST systems share the same training data and architecture.
“The new end-to-end S2ST technology has been launched in two key areas,” Google said, adding that “it is now available in Google Meet on servers, and as a built-in on-device feature for the new Pixel 10 devices.”
Google adds that while Google Meet and Pixel “utilize different strategies for running the S2ST pipeline, they share training data and model architecture.” For Pixel’s Voice Translate, the company notes that it “also employs a cascade approach to maximize language coverage.”
Google Meet S2ST currently supports five Latin-based language pairs (English to and from Spanish, German, French, Italian, and Portuguese), but the company reports “promising capabilities” in Hindi, with plans for further expansion.
Google also highlighted ongoing work on reducing literal, word-for-word output by improving the model’s lookahead capabilities for languages whose word order differs significantly from that of English.
“We believe that this breakthrough in S2ST technology will revolutionize real-time, cross-language communication, turning a long-envisioned concept into reality,” Google concluded.
In an accompanying explainer, the team described how a cross-Google effort spanning Pixel, Cloud, Chrome, and DeepMind moved development along far faster than expected. “When we started, we thought, ‘Maybe this will take five years.’ Two years later, here we are,” they said.