
Machine translation engines like Google Translate or DeepL bring the power of neural networks to hundreds of millions of users worldwide. They are part of the daily routine: translating entertainment content, helping tourists get around, and much more. This technology has become the core of the translation industry because it allows significant volumes of information to be translated in a relatively short time.

Despite its benefits, the quality of machine translation results still needs improvement. Errors can occasionally occur, and the rarer the language pair, the more issues can arise.

When quality problems occur in companies with large volumes of data, translators have to re-translate after the machines and spend much more time, so the question of how to automate and improve the process arises. This is where MTQE, machine translation quality estimation, comes into play.

What is MTQE?

MTQE is an automated method for evaluating the quality of machine translation output without directly comparing it to a reference translation. This approach is distinct from MTQEVAL (machine translation quality evaluation), so let's explore both methods in more detail to better understand the advantages of MTQE.

But first, let's overview the prominent cases where automated machine translation quality estimation is required and can significantly assist.

  • Real-time translation services or systems. This group includes systems that deliver automated translation results, such as applications that provide live subtitles for chats or videos, or software for multilingual customer support.
  • Companies with a significant volume of content for translation, where manual translation checking isn't feasible or is too expensive.
  • Systems that use CLIR (Cross-Language Information Retrieval), like international search systems, e-libraries, and marketplaces/trading platforms.

Now, let's check how MTQE approaches work and how they can cover the cases above.

MTQE vs. MTQEVAL

The MTQE methods are based on predicting the quality of machine translation results. Under the hood, such a system can use neural networks, statistical and machine learning methods, or hybrid solutions. MTQE's main task is to evaluate how logically and accurately a translation matches the source text. This approach checks the meaning alignment between the source and the translated text, as well as the accuracy of key elements such as numbers, names, or terms, and can be split into:

  • Evaluating syntactic accuracy (checking whether the translation complies with the grammatical rules of the target language).
  • Evaluating semantic accuracy (checking how accurately words and phrases are translated in terms of meaning).
  • Checking the usage of terms.
  • Checking the entire context.

In simple words, MTQE "understands" the meaning of the source text and the translated result and can then provide a rating based on numerical values or categories (e.g., "excellent translation," "needs improvement," "poor quality").
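
This is easy to try in code. Below is a minimal sketch of reference-free scoring, assuming the open-source unbabel-comet package and its CometKiwi model (Unbabel/wmt22-cometkiwi-da, which may require a Hugging Face account); the package and model are illustrative choices, and API details can vary between versions.

```python
# Minimal sketch of reference-free quality estimation (MTQE).
# Assumes: pip install unbabel-comet, plus access to the
# Unbabel/wmt22-cometkiwi-da checkpoint (may require HF login).
from comet import download_model, load_from_checkpoint

model_path = download_model("Unbabel/wmt22-cometkiwi-da")
model = load_from_checkpoint(model_path)

# QE input is just source + machine translation; no reference needed.
data = [
    {"src": "Das ist ein Blog über Lokalisierung.",
     "mt": "This is the blog about localization."},
]

output = model.predict(data, batch_size=8, gpus=0)  # gpus=0 -> CPU
for score in output.scores:
    # Scores land roughly between 0 (poor) and 1 (excellent) and can
    # be mapped to categories like "excellent" / "needs improvement".
    print(f"Estimated quality: {score:.2f}")
```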

MTQE pros:

  • Saves time on reference translation creation - the models can work without high-quality references, which reduces manual translation work.
  • Raises speed - MTQE can run in real time, which is vital for online translation systems.
  • Scales easily - an increased volume of content for translation wouldn't affect the process, as there is no need to prepare reference translations.
  • Simplifies big-data processing - MTQE helps quickly determine which translations need revision and which can be used as is. This logic allows companies with vast content volumes to focus on the text segments with low quality scores instead of proofreading all the content (see the routing sketch below).
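
The last point above is where QE scores pay off operationally: segments get routed automatically by score. Here is a minimal, dependency-free sketch; the Segment structure and both threshold values are illustrative assumptions that would be tuned per language pair and domain.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    source: str
    translation: str
    qe_score: float  # e.g., a CometKiwi-style score between 0 and 1

def route(segments, publish_at=0.85, review_at=0.60):
    """Split segments into publish / post-edit / re-translate buckets."""
    publish, post_edit, retranslate = [], [], []
    for seg in segments:
        if seg.qe_score >= publish_at:
            publish.append(seg)      # good enough to ship as is
        elif seg.qe_score >= review_at:
            post_edit.append(seg)    # send to a human post-editor
        else:
            retranslate.append(seg)  # too poor to be worth editing
    return publish, post_edit, retranslate

segments = [
    Segment("Hallo Welt", "Hello world", 0.93),
    Segment("Zahlung fehlgeschlagen", "Payment failed again", 0.71),
    Segment("Allgemeine Geschäftsbedingungen", "General terms confusion", 0.41),
]
publish, post_edit, retranslate = route(segments)
print(len(publish), len(post_edit), len(retranslate))  # -> 1 1 1
```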

MTQE cons:

  • Limited accuracy. Since MTQE does not use reference translations, it may be less accurate than reference-based methods.
  • Training and interpretation challenges. MTQE models require high-quality training, and it can sometimes be difficult to understand which aspects of the translation cause a high or low quality score.

Conversely, MTQEVAL always requires a reference translation as a base for comparison. Let's overview two popular approaches - BLEU and METEOR - and how they work.

BLEU (Bilingual Evaluation Understudy) compares the translated text with a reference translation based on a list of criteria: it analyzes sequences of words or characters and the length of the output (the brevity penalty). The entire BLEU logic is built around the "n-gram" - a sequence of n elements.

For instance, let's assume that we have:

  • Reference translation: "This is a blog about localization"
  • Translation: "This is the blog about localization"

The translation's 1-grams are "This," "is," "the," "blog," "about," "localization" (5 matches out of 6; "the" has no counterpart in the reference, which uses "a").

The 2-grams are "This is," "is the," "the blog," "blog about," "about localization" (3 matches out of 5; "is the" and "the blog" don't occur in the reference).

So, BLEU combines the results for 1-grams, 2-grams, 3-grams, and 4-grams (typically as a geometric mean multiplied by the brevity penalty) to provide the final assessment (e.g., if a translation contains 20 n-grams and 10 of them match the reference, that precision is 50%).
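
To make the arithmetic concrete, here is a small sketch that reproduces the counts above by hand and then lets NLTK combine them into a BLEU score (nltk is an assumed dependency; smoothing is needed because this short pair has no matching 4-grams):

```python
from collections import Counter
# Assumes: pip install nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(reference, hypothesis, n):
    """Share of hypothesis n-grams that also occur in the reference,
    counting each reference n-gram at most as often as it appears."""
    ref_counts = Counter(ngrams(reference, n))
    hyp_counts = Counter(ngrams(hypothesis, n))
    matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
    return matches, sum(hyp_counts.values())

reference = "this is a blog about localization".split()
hypothesis = "this is the blog about localization".split()

print(clipped_precision(reference, hypothesis, 1))  # -> (5, 6)
print(clipped_precision(reference, hypothesis, 2))  # -> (3, 5)

# sentence_bleu combines the 1- to 4-gram precisions (geometric mean
# times brevity penalty); smoothing avoids zeroing the score when
# higher-order n-grams have no matches, as happens here.
smooth = SmoothingFunction().method1
print(sentence_bleu([reference], hypothesis, smoothing_function=smooth))
```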

METEOR (Metric for Evaluation of Translation with Explicit Ordering) is an "improved BLEU" because it considers synonyms, word stems, and paraphrases in addition to n-grams. Thus, this approach is more sensitive to the meaning of the translation but still requires a reference translation to work.
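
NLTK also ships a METEOR implementation, so the same sentence pair can be scored with synonym- and stem-aware matching. A minimal sketch, assuming nltk plus its WordNet data (recent NLTK versions expect pre-tokenized input):

```python
import nltk
from nltk.translate.meteor_score import meteor_score

# WordNet powers METEOR's synonym matching; a one-time download.
nltk.download("wordnet", quiet=True)

reference = "this is a blog about localization".split()
hypothesis = "this is the blog about localization".split()

# Unlike plain n-gram matching, METEOR also aligns stems and synonyms,
# so near-paraphrases are penalized less than under BLEU.
print(meteor_score([reference], hypothesis))
```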

MTQEVAL pros:

  • High accuracy. The MTQEVAL methods typically provide a more accurate and objective review of translation quality, as they have an exact example of what the translation should look like.
  • Transparent metrics. Approaches like BLEU and METEOR have transparent criteria, making it easier to see all aspects and allowing for a more in-depth assessment.

MTQEVAL cons:

  • Expensive and time-consuming. As these methods require reference translations, users have to prepare those references and spend additional time and budget on them.
  • High dependence on the quality of the reference translation. Issues in the reference text will propagate into every translation evaluated against it.
  • Not suitable for real-time processes. Besides the need to prepare reference translations, the process is resource-intensive, especially for large volumes of text.

For convenience, take a look at the table below:

Criteria           MTQE                             MTQEVAL
Reference needed   No                               Yes
Purpose            Predicting quality               Comparing with an ideal standard
Metrics            Statistical models, etc.         BLEU, METEOR, etc.
Speed              Fast                             Slow
Accuracy           Medium                           High
Ways to use        Translation process automation   Accurate comparisons

Note: It is a widespread practice to combine both approaches: MTQE can help with a quick quality assessment, while MTQEVAL can be used for a more in-depth analysis (a combined workflow is sketched below).
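
Such a combined workflow can be sketched in a few lines; the qe_score and bleu_score callables are hypothetical stand-ins for tools like the ones shown earlier, not a prescribed pipeline.

```python
def evaluate(source, translation, reference=None,
             qe_score=None, bleu_score=None):
    """Quick MTQE check always; deeper MTQEVAL only when a reference exists."""
    report = {"qe": qe_score(source, translation)}
    if reference is not None:
        report["bleu"] = bleu_score(reference, translation)
    return report

# Toy stand-ins so the sketch runs end to end (purely illustrative).
def toy_qe(src, mt):
    return 0.9 if mt else 0.0

def toy_bleu(ref, mt):
    return 1.0 if ref == mt else 0.5

print(evaluate("Hallo Welt", "Hello world",
               qe_score=toy_qe, bleu_score=toy_bleu))
print(evaluate("Hallo Welt", "Hello world", reference="Hello world",
               qe_score=toy_qe, bleu_score=toy_bleu))
```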

To sum up

Getting perfect translation quality without compromises is the number one target for any business. At Lingohub, we understand business needs and know that implementing custom quality-evaluation models can be unprofitable. That's why we believe providing a great translation from the start is better than improving it afterward. To simplify the process for Lingohubbers, we offer the following:

  • Machine translation that combines the top engines on the market - Google Translate, DeepL, and Amazon Translate - for the best result;
  • Glossaries and style guides that shape the machine translation results;
  • Translation memory, which allows reusing previous translations to reduce repetitive tasks;
  • A post-editing score to evaluate translators' efforts fairly and transparently;
  • And much more.

Book a demo call with our team for more information, or try the 14-day free trial and check everything yourself.
