As part of this effort, Microsoft is also partnering with Hugging Face to host and distribute the resulting data. Additionally, the company is funding work with Common Crawl, an open repository of web data, to enable native speakers to annotate European language data and seed it, with authentic linguistic nuances, into the repository's publicly available datasets.
Grants for Data and Academic Partnerships
Both MOIC and the AI for Good Lab are “also issuing a call for proposals to help expand the supply of digital content for 10 European languages,” focusing on languages with low online representation, such as Estonian, Alsatian, Slovak, Greek, and Maltese.
Chosen proposals will receive grants for Azure credits worth up to USD 1m and essential engineering and technical support. Applications will open on September 1, 2025, on the AI for Good Lab website. Both MOIC and the lab will prioritize projects that aim to make data available for languages currently underrepresented online.
Microsoft also plans to continue backing initiatives such as those at the Barcelona Supercomputing Center, the Basque Center for Language Technology, and the University of Santiago de Compostela, which relate to AI models trained in Spanish, Catalan, Basque, and Galician.
Additionally, the company is forging new partnerships with the University of Strasbourg and IE University School of Science & Technology in Spain, providing Azure grants for joint research that focuses on low-resource languages.
While the announcement does not expressly mention a direct benefit to Microsoft in terms of new intellectual property or data from this initiative, the company did state in its European Digital Commitments announcement from April that it would “start with an expansion of our cloud and AI infrastructure in Europe,” signaling an alignment with business goals.