With WWDC 2023, Apple’s annual developer conference, BERT is making its way into the (Apple) developer mainstream. BERT, short for Bidirectional Encoder Representations from Transformers, was open-sourced by Google in late 2018 as a technique for NLP pre-training.
Fast forward to Apple’s June 7, 2023, WWDC session, where the iPhone maker featured BERT as the key to creating new multilingual models in its Create ML app and framework. Create ML is a tool for training models for a variety of machine learning tasks, covering image, sound, and activity data as well as text tasks such as text classification and word tagging.
Apple reminded developers that transformer-based contextual embeddings are trained on large amounts of text using a masked language model style of training, in which the model is prompted to suggest a missing word in a sentence. The multi-headed self-attention mechanism behind Transformers allows models to train on large amounts of textual data, including multilingual data.
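To make the masked-word idea concrete, here is a minimal Python sketch of how training examples for this style of objective might be prepared. It is an illustration only, not Apple's or Google's implementation: the function name `mask_tokens`, the whitespace tokenization, and the `mask_prob` parameter are assumptions made for the example, and the sketch only ever swaps in a `[MASK]` placeholder, whereas BERT's published recipe also sometimes substitutes a random token or leaves the chosen token unchanged.

```python
import random

MASK = "[MASK]"  # placeholder the model must fill in


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Hide a random subset of tokens (hypothetical helper).

    Returns (masked_tokens, labels). labels[i] holds the original
    token wherever a mask was applied, and None everywhere else,
    so a model would only be scored on the hidden positions.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)   # hide the word from the model
            labels.append(tok)    # remember what it should predict
        else:
            masked.append(tok)
            labels.append(None)   # nothing to predict here
    return masked, labels


if __name__ == "__main__":
    # Naive whitespace tokenization, assumed for illustration only.
    sentence = "the model is prompted to suggest a missing word".split()
    masked, labels = mask_tokens(sentence, mask_prob=0.3, seed=0)
    print(" ".join(masked))
    print(labels)
```

During training, the model sees the masked sequence and is penalized only for wrong guesses at the hidden positions, which is why it has to draw on the words on both sides of each gap.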

