The TEC model uses the same structural foundation as Automatic Post-Editing (APE), which has been widely studied but is different from TEC in many ways. For example, TEC uses errors made by humans as data and focuses on error correction instead of detection. TEC can also discern content that does not need editing.
In contrast to TEC, APE is “dominated by the fluency errors that are characteristic of MT systems (74% of sentences),” the paper stated, adding that the “TEC corpus exhibits a broader distribution of errors that human translators are prone to make.”
Asked to define translation fluency, the scientists replied, “Fluency of a translation describes whether a native speaker of the language would use the phrasing, structure, and word choice that appears in the translation.”
Scientists used a bilingual corpus called ACED, which contains three datasets from different domains. The data consists of 35,261 English–German translations performed and edited by professional translators (not post-edited).
To prepare the data, the scientists eliminated duplicate source sentences, removed translations rewritten by reviewers, and classified errors into three main categories: monolingual edits found in the target text, bilingual edits that correct translation errors, and preferential edits.
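The first two preparation steps could be sketched roughly as follows. This is a minimal illustration, not the scientists' actual pipeline: the record fields (`source`, `translation`, `edited`), the `prepare` function, and the similarity threshold used to flag a "rewritten" translation are all assumptions made for the example. Classifying the remaining edits into the three categories is left out, since the article does not say how that was done.

```python
from difflib import SequenceMatcher


def prepare(records, rewrite_threshold=0.5):
    """Deduplicate by source sentence and drop heavily rewritten translations.

    `records` is an iterable of dicts with the hypothetical keys
    'source', 'translation' (original) and 'edited' (after review).
    `rewrite_threshold` is an assumed cut-off, not a value from the paper.
    """
    seen_sources = set()
    kept = []
    for rec in records:
        # Step 1: keep only the first occurrence of each source sentence.
        key = rec["source"].strip().lower()
        if key in seen_sources:
            continue
        seen_sources.add(key)

        # Step 2: treat a very low character-level similarity between the
        # original and the edited translation as a full rewrite and skip it.
        similarity = SequenceMatcher(None, rec["translation"], rec["edited"]).ratio()
        if similarity < rewrite_threshold:
            continue

        kept.append(rec)
    return kept


if __name__ == "__main__":
    sample = [
        {"source": "Hello world.", "translation": "Hallo Welt.", "edited": "Hallo, Welt."},
        {"source": "Hello world.", "translation": "Hallo Welt!", "edited": "Hallo Welt!"},
    ]
    for rec in prepare(sample):
        print(rec["source"], "->", rec["edited"])
```

A character-level ratio is only a crude stand-in for "rewritten by reviewers"; a real pipeline would more likely rely on reviewer annotations or token-level edit distance.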
The ACED data was pre-processed, and the model was pre-trained and then fine-tuned using actual human corrections. Tests were conducted and comparisons made between TEC and other models, including MT, GEC (grammatical error correction), and BERT-APE.
Nine professional translators participated in the study as reviewers to determine the real-world applicability of the model. The nine were asked to review sentences (of which 255 had suggestions for corrections) and provide qualitative observations.
A tenth professional translator was tasked with reviewing the reference translations in the dataset and ranking the quality of the sentences reviewed by the other nine.
Next Step in Translation Workflow Automation?
Comparisons to other models highlighted significant differences in how the TEC model ultimately performed. For example,
- the professional reviewers accepted 79% of the TEC suggestions for correction;
- reviewers spent less time reviewing when suggestions were accepted; and
- domain adaptation proved critical to performance, and customization proved essential to translation error correction.
Five of the nine reviewers emphasized the need for reliability. In the test, some suggestions were incorrect, and in other cases the system did not reliably make an applicable edit.
Three reviewers found the TEC system “could be a memory aid or substitute for researching client-specific requirements.”
Three reviewers commented that TEC could help “by making them aware of what errors they might look out for, especially in repetitive content where it may be easy to miss details.”
Given the findings, TEC could be the next step in translation workflow automation. As the model’s precision increases, so does its potential to make a practical difference during the review stages of translation production.