Research Attempts to Bring Literary Machine Translation Closer to Human Quality

Dr. B.J. Woodstein, a professor, translator, and writer, emphasized on SlatorPod in April 2024 that human intervention remains crucial in literary translation, particularly for handling linguistic nuances and cultural concepts that surpass the capabilities of AI or machine translation (MT). Researchers from Aalborg University and the University of Groningen have nonetheless set out to improve literary MT so that it better preserves the stylistic and creative elements of the original text.

In their research paper published on August 30, 2024, the researchers explained that lexical diversity — the range of unique words used in a text — is important in literature “where it matters not only what is written, but also how it is written.”

However, they noted that MT systems often produce translations that are “lexically poorer” than those generated by human translators, leading to a loss of stylistic nuance.
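To make the notion of a "lexically poorer" text concrete, here is a minimal sketch that computes the type-token ratio (TTR), one common and simple measure of lexical diversity. The article does not say which measure the researchers used, so TTR is an assumption, and the two example sentences are invented for illustration. TTR is also sensitive to text length, so it is best used to compare texts of similar size.

```python
import re


def type_token_ratio(text: str) -> float:
    """Share of distinct word forms among all word tokens (0.0 for empty text)."""
    # Lowercase and split on word characters; a deliberately crude tokenizer.
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Invented examples: a varied "human" rendering and a repetitive "machine" one.
human = "The old house groaned, sighed and muttered through the long night."
machine = "The old house made noise and made noise through the long night."

print(f"human TTR:   {type_token_ratio(human):.3f}")
print(f"machine TTR: {type_token_ratio(machine):.3f}")
```

In this toy case the repetitive sentence scores lower, which is the kind of gap the researchers describe between MT output and human translations.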

To address this issue, they proposed an approach to recover the “lost” lexical diversity in MT through a tailored re-ranking of translation candidates. Rather than applying a rigid increase in lexical diversity across all texts, their approach adapts the recovery process to align with the diversity of the original work.

Model-Agnostic

The process begins with the MT system generating multiple translation hypotheses for a given source text. A classifier then assesses these hypotheses, estimating the likelihood that each one resembles an original text in the target language. Additionally, each original text is assigned a lexical diversity score that reflects its vocabulary richness, which is factored into the re-ranking process.

Translation hypotheses are sorted based on their probabilities of being original texts, with the final selection influenced by the original text’s lexical diversity score. This means that a translation hypothesis with a high probability may be bypassed if it does not match the desired lexical richness.

The output is a translation hypothesis that best balances the likelihood of being original with the lexical diversity score, ensuring the translation conveys meaning while reflecting the original’s stylistic richness.
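The selection step described above can be sketched roughly as follows. The article does not give the researchers' actual scoring rule, so everything here is an assumption made for illustration: the `Hypothesis` fields, the `rerank` function, the `weight` parameter, the linear penalty on the diversity gap, and all of the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    p_original: float  # hypothetical classifier probability of "reads like an original"
    diversity: float   # hypothetical lexical diversity of this candidate, in [0, 1]


def rerank(hypotheses, target_diversity, weight=1.0):
    """Pick the candidate with the best trade-off under an assumed scoring rule.

    Score = classifier probability minus a penalty for straying from the
    diversity level derived from the original text.
    """
    if not hypotheses:
        raise ValueError("need at least one hypothesis")

    def score(h):
        return h.p_original - weight * abs(h.diversity - target_diversity)

    return max(hypotheses, key=score)


# Made-up candidates for one source passage.
candidates = [
    Hypothesis("candidate A", p_original=0.90, diversity=0.60),
    Hypothesis("candidate B", p_original=0.80, diversity=0.85),
    Hypothesis("candidate C", p_original=0.50, diversity=0.90),
]

# A lexically rich original: the top-probability candidate A is bypassed.
print(rerank(candidates, target_diversity=0.88).text)
# A plainer original: candidate A is kept.
print(rerank(candidates, target_diversity=0.60).text)
```

Under these assumptions the same candidate list yields different winners depending on the original's diversity, which is the "tailored" behavior the researchers describe, as opposed to always pushing diversity upward.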

The researchers emphasized that their approach is “model-agnostic.” As long as the MT system can generate multiple translation candidates for a given text, the re-ranking method can be applied to improve the selection of the best translation.

Closer to Human Quality

To evaluate the effectiveness of this approach, the researchers tested it on 31 English-to-Dutch book translations, employing various metrics, including BLEU and COMET scores for translation accuracy, and lexical diversity scores to assess vocabulary richness.

They compared the tailored re-ranking approach to both vanilla MT and human translations and found that the tailored re-ranking method produced translations with lexical diversity closer to that of human translations.

Authors: Esther Ploeger, Huiyuan Lai, Rik van Noord, and Antonio Toral
