One MT provider that beefed up its access to computational power is DeepL. The Germany-based company established a data center in Iceland to help source its computing power, claiming its supercomputer is among the largest in the world.
The large-scale training and development of MT (as well as NLP models more broadly) “could have detrimental consequences on the environment,” whether or not their energy usage is carbon neutral. In short, the energy used to train MT engines is “possibly contributing directly or indirectly to the effects of climate change,” the researchers said.
Carbon Emissions per Language Pair
The researchers’ work involved evaluating six language pairs to assess the computational power required for training; that is, which pairs were more power-hungry and, hence, carbon-emitting.
By assessing the differences in carbon emissions per language pair, the researchers hoped to open the door to a more environmentally friendly approach to MT training that takes into account the specific way a language pair performs.
The experiments focused on English, German, and French and their six possible language combinations. The researchers compared performance across two models, a convolutional sequence-to-sequence learning model (ConvSeq) and a transformer-based model with attention mechanisms, and used a dataset that contained around 30,000 samples for each language.
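To see where the six combinations come from, note that direction matters in MT, so each ordered source>target pairing counts separately. A minimal Python sketch (the variable names are illustrative and not taken from the study):

```python
from itertools import permutations

# The three languages covered by the experiments.
languages = ["English", "German", "French"]

# Ordered (source, target) pairs: English>German and German>English
# are distinct translation directions, so 3 * 2 = 6 pairs result.
pairs = [f"{src}>{tgt}" for src, tgt in permutations(languages, 2)]

for pair in pairs:
    print(pair)
```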
The researchers tracked the carbon emissions released during training using the CodeCarbon package as well as the improvement in BLEU scores for reference and comparison.
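Emissions trackers of this kind generally estimate the energy drawn by the hardware during a run and multiply it by the carbon intensity of the local electricity grid. The sketch below illustrates only that arithmetic; it is not CodeCarbon's code, and the function name, parameters, and example figures are assumptions chosen for illustration.

```python
def estimate_emissions_kg(avg_power_watts, hours, grid_kg_co2_per_kwh):
    """Rough CO2-equivalent estimate for one training run.

    All inputs are hypothetical: average hardware power draw in watts,
    run time in hours, and grid carbon intensity in kg CO2eq per kWh.
    """
    energy_kwh = (avg_power_watts / 1000.0) * hours
    return energy_kwh * grid_kg_co2_per_kwh


# Made-up example: a 300 W GPU running for 5 hours on a grid
# emitting 0.4 kg CO2eq per kWh.
print(estimate_emissions_kg(300, 5, 0.4))
```

Under this simplification, a pair that takes twice as long to train on the same hardware and grid emits roughly twice as much, which is why training time and carbon intensity move together in the findings.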
Environmentally (Un)friendly MT
Not only did the pairs with German as the target language display the lowest BLEU scores, they also took the longest to achieve a BLEU threshold score of 25. The researchers said this second finding supported the hypothesis that “translation to German might be more computationally involved than French or English.”
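One way to picture the threshold comparison is to record a BLEU score after each training epoch and find the first epoch at which it reaches 25. The following is a minimal sketch under that assumption; the function name and the score lists are hypothetical and do not reproduce the study's data or procedure.

```python
def epochs_to_threshold(bleu_by_epoch, threshold=25.0):
    """Return the first 1-based epoch whose BLEU meets the threshold.

    Returns None if the threshold is never reached.
    """
    for epoch, score in enumerate(bleu_by_epoch, start=1):
        if score >= threshold:
            return epoch
    return None


# Hypothetical learning curves, for illustration only.
fast_pair = [12.0, 21.5, 26.3, 29.0]
slow_pair = [8.0, 14.2, 19.9, 23.1, 25.4]

print(epochs_to_threshold(fast_pair))
print(epochs_to_threshold(slow_pair))
```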
The French>German, English>German, and German>French language pairs took the longest to train and were, as a result, the most carbon-intensive. The French>German language pair was “the most computationally expensive” across both models.
By contrast, English>French, German>English, and French>English, which each involved English as a source or target language, took less time to train and were the least carbon-intensive.
Interestingly, the German dataset was the most lexically diverse of the three, as measured by vocabulary size per number of tokens. This “likely demonstrates that lexical diversity is directly proportional to training time to achieve an adequate level of performance,” the researchers noted.
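Vocabulary per number of tokens is commonly known as the type-token ratio: the count of distinct words divided by the total word count. A minimal sketch, assuming simple lowercased whitespace tokenization (the study's actual tokenization is not described here, and the sample sentence is invented):

```python
def type_token_ratio(tokens):
    """Distinct tokens divided by total tokens; 0.0 for empty input."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Assumption: lowercase and split on whitespace. Real pipelines
# typically use a proper tokenizer.
sample = "the cat saw the dog".lower().split()
print(type_token_ratio(sample))  # 4 distinct words / 5 tokens = 0.8
```

A higher ratio means more distinct word forms for the same amount of text, which is the sense in which the German data was more diverse.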
When comparing the two systems, the transformer models proved to be significantly less carbon-emitting than the ConvSeq models, which the researchers attributed to the fact that the former had comparatively fewer parameters. Transformers also achieved higher BLEU scores.
The researchers concluded that a disparity exists between language pairs in terms of carbon emissions and “language pairs involving English demonstrate higher performance than ones that do not.” However, “much study remains to be done to identify what exactly it is that causes the differences in emissions,” they said.
Aside from proposing ways “to reduce carbon emissions released while training and deploying machine translation systems that are trained extensively over large datasets,” the researchers said future research could also be extended to low-resource languages and those that do not follow the Latin script.