Machine translation has significantly evolved over time, especially in terms of accuracy levels in its output.

According to the image below, released by Google in 2016, Google Translate performs translations at varying levels of accuracy that approach those of human translators, from Spanish, Chinese, and French to English and vice versa.

Google Translate Accuracy

A graphic from Google Research highlighting the 2016 accuracy levels of Google Translate

As machine translation applications are reaching significantly high accuracy levels, they are being increasingly employed in more areas of business, introducing new applications and improved machine-learning models.

In this article we set out to explore a wide array of B2B and consumer applications for machine translation, and we aim to answer the following questions for our readers:

  • What is currently possible with today’s machine translation technology?
  • What kinds of translation applications can be used in many contexts, and which applications are industry specific (in law, finance, and more)?
  • Based on progress thus far, what seem to be the most promising use-cases for machine translation in business today?

Before exploring B2B and consumer translation technology, let us have a brief look at the insights summing up machine translation applications and their use cases.

Machine Learning for Language Translation Applications – Insights Up Front

From our research, the machine translation applications seem to fall into two major categories, depending on the target audience they serve.  

Machine Translation in Industry for Business Use

Although big players like Google Translate and Microsoft Translator offer near-accurate, real-time translations, some “domains” or industries call for highly-specific training data related to the particular domain in order to improve accuracy and relevancy. Here, generic translators would not be of much help as their machine-learning models are trained on generic data.

These applications are used by small, medium and large enterprises. Some organizations offer multi-domain translation services–that is, customizable solutions across multiple domains–and other organizations offer translation solutions only for a specific domain. These solutions, although automated for the most part, still depend on human translators for pre- and post-editing processes. Some fields that warrant domain-specific machine translation solutions are:

  • Government
  • Software & technology
  • Military & defence
  • Healthcare
  • Finance
  • Legal
  • E-discovery
  • Ecommerce

Online / App Machine Translation for Consumer Use

These machine learning applications perform instant translation for textual, audio, and image files (images of words on screens, papers, signboards, etc.) from a source language into a target language. These are typically generic–that is, non-domain specific–albeit with a high translation accuracy.

These applications are usually lightweight, cloud-based apps or wearable devices that are typically trained on crowd-sourced data. They are mostly used by individual consumers, such as travelers and students. Real-time translation applications most commonly offer:

  • Text-to-text
  • Text-to-speech
  • Speech-to-text
  • Speech-to-speech
  • Image (of words)-to-text

Let us first delve into how machine translation is employed across global businesses, followed by its consumer applications.

B2B Domain-Specific (Industry-Specific) Machine Translation

Over the years, multiple organizations have cropped up with intelligent services and products offering domain-specific machine translation. General-purpose translators such as Google Translate, which are usually used in consumer applications, may have difficulty translating data related to specific business domains, which have their own nuances and terminologies.

These business domains are vast and include industries like technology, software, automotive, e-discovery, legal, financial, military & defence, healthcare, e-learning, ecommerce institutions, and so on.

A machine translation system can be adapted to a specific domain by using training data from the same domain. For example, in order to adapt an MT system for the legal domain, training data including the most commonly used contextual terms, keywords, phrases, terminology, etc., in the legal domain are compiled into corpora, which act as an exhaustive data repository for the MT system to refer to and train on.

Based on the system’s usage and the models the system is built on, over time, the machine-learning algorithms better adapt themselves to providing more accurate translation results by refining and adding to the domain-specific corpora. More often than not, MT models are used in conjunction with human translators for perfecting the corpora and for post editing processes.
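The feedback loop described above, in which human post-edits refine a domain-specific corpus, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any vendor in this article implements it: the class name DomainAdaptedTranslator, the toy_generic_engine stand-in, and the legal phrase are all hypothetical, and the sketch uses a simple lookup memory in place of actually retraining a model.

```python
from typing import Callable, Dict


class DomainAdaptedTranslator:
    """Wraps a generic translator with a growing, domain-specific memory."""

    def __init__(self, generic_translate: Callable[[str], str]) -> None:
        self.generic_translate = generic_translate
        # Maps a normalized source segment to its human-approved translation.
        self.memory: Dict[str, str] = {}

    @staticmethod
    def _key(segment: str) -> str:
        # Normalize case and whitespace so trivial variants share one entry.
        return " ".join(segment.lower().split())

    def translate(self, segment: str) -> str:
        # Prefer a human-approved, in-domain translation when one exists.
        approved = self.memory.get(self._key(segment))
        if approved is not None:
            return approved
        return self.generic_translate(segment)

    def add_post_edit(self, segment: str, corrected: str) -> None:
        # A post-editor's correction becomes part of the domain corpus.
        self.memory[self._key(segment)] = corrected


def toy_generic_engine(segment: str) -> str:
    # Hypothetical stand-in for a generic MT engine; it only tags the input.
    return f"[generic] {segment}"


legal_mt = DomainAdaptedTranslator(toy_generic_engine)
before = legal_mt.translate("force majeure")
legal_mt.add_post_edit("force majeure", "fuerza mayor")
after = legal_mt.translate("Force  Majeure")

print(before)
print(after)
```

The first call falls through to the generic engine. Once a human translator has supplied the legal term, later requests for the same segment return the approved wording. A production system would typically go further and use such corrections as additional training data.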

Some corporations offer machine translation services in multiple domains, whereas others offer specialized services in a singular domain.

Multi-Domain Machine Translation Services

KantanMT

Dublin-based KantanMT offers “a SaaS-based Machine Translation platform that enables users to develop and manage custom machine translation engines in the cloud.” According to the company’s website, the platform can be customized to offer machine translation services across eight domains, including e-retail, government, and travel.

The 3-min demo video below explains how the KantanMT platform works. The platform is typically integrated with a business’s IT infrastructure. Once integrated, this platform allows for the creation of client profiles by enabling the business user to add domain-specific training and terminology data, which are derived from the business’s corpora/data repository.

KantanMT claims that its MT engines offer over 760 language combinations.

SYSTRAN

SYSTRAN offers a variety of translation services for five domains/industries. The SYSTRAN Enterprise Server, which is deployed on the business server of the client, offers three modes of translation:

  • Full-text translation (translates plain text),
  • File translation (translates files like Powerpoint, Excel, Word, etc., without altering their formats), and
  • Web translation services (translates entire web pages).

The product video below gives a brief idea about how these three modes work:

SYSTRAN products run on the company’s own neural network engine, which it calls Pure Neural Machine Translation. The engine is used in an open-source community (OpenNMT), where research results on applying artificial neural networks to natural language processing are shared.

Specialized Machine Translation Services

SDL Government

In the interview below, SDL Government’s CEO, Danny Rajan, says that the company primarily serves the US government, focusing on defense and intelligence use cases.

One such use case, Mr. Rajan explains, is the real-time translation of social media feeds from Russian or Arabic into English using the company’s translation system, which could then provide information or “actionable insights” to the government on whether the individual who posted could pose a national threat.

He adds that SDL Government’s technology can translate multilingual content, including social media feeds, documents, audio, and images, in real time.

Canopy Innovations – Healthcare

Canopy Speak is a medical translator app that runs on a phrase library, which the company claims is the largest corpus of pre-translated medical phrases, organized by frequently encountered procedures and medical specialties.

The video below demonstrates how the app can be useful in medical situations involving non-English speakers. It takes the example of a doctor in emergency medicine encountering a Spanish-speaking patient in pain. It goes on to explain how the doctor can quickly navigate or search through the clinically organized categories to ask the patient simple and relevant yes-or-no questions through text-to-speech translation of English-Spanish phrases that come built in with the app.

The app is available in text and audio in 15 languages. It offers only one-way communication, however, so the medical examiner must call a human interpreter to translate the patient’s responses or to handle two-way communication involving more medical detail.

Lingua Custodia – Finance

Lingua Custodia specializes in machine translation of financial documents, such as key investor information documents (KIIDs), fund prospectuses, fund annual reports, fund factsheets, and portfolio management commentaries.

The company claims that its key translation tool, VERTO, is fully customizable and can “learn” from the texts previously translated by its users.

The demo video below shows how VERTO’s customized translation engine is used to translate a KIID PDF from French to English, using an engine dedicated to KIIDs and the client company’s database.

Lingua Custodia’s website does not clearly list or mention who its customers are, except that “it meet the needs of professionals requiring the translation of financial documents.”

Consumer Machine Translation Applications

Google Translate

Among the B2C machine translation applications, it is common knowledge that Google Translate is the biggest player. Its real-time translation capabilities now include text, speech, and image (of words), all packaged into a single platform in the form of a mobile app and cloud service.

Google first announced the launch of Google Translate over a decade ago. Since then, it has consistently advanced to include speech recognition and image recognition technologies for translation. Most recently, the company launched the Google Neural Machine Translation system (GNMT).

According to the research document linked above, GNMT uses neural machine translation (NMT) technology, which considers the entire input sentence as a single unit for translation, as opposed to breaking the sentence into words and phrases.

The GIF below depicts how GNMT translates Chinese to English. The research document reads, “First, the network encodes the Chinese words as a list of vectors, where each vector represents the meaning of all words read so far (‘Encoder’). Once the entire sentence is read, the decoder begins, generating the English sentence one word at a time (‘Decoder’).”

You can find more information on how the technology works in the detailed research published by Google here.

A simple visualization of how Google’s translation “Decoder” works
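To make the encoder and decoder roles more concrete, here is a toy Python sketch of the data flow only. It is not Google’s implementation: the two-dimensional EMBEDDINGS, the averaging recurrence, and the scripted toy_next_word function are all assumptions made for illustration, standing in for parameters that a real neural network would learn from data.

```python
from typing import Callable, Dict, List

Vector = List[float]

# Hypothetical 2-dimensional "embeddings"; a real system learns these.
EMBEDDINGS: Dict[str, Vector] = {
    "知识": [1.0, 0.0],
    "就是": [0.0, 1.0],
    "力量": [1.0, 1.0],
}


def encode(source_words: List[str]) -> List[Vector]:
    """Return one vector per word; each summarizes all words read so far."""
    state = [0.0, 0.0]
    states = []
    for word in source_words:
        embedding = EMBEDDINGS[word]
        # Toy recurrence: blend the previous state with the new word.
        state = [0.5 * s + 0.5 * e for s, e in zip(state, embedding)]
        states.append(state)
    return states


def toy_next_word(states: List[Vector], produced: List[str]) -> str:
    # Stand-in for the trained decoder network: follows a fixed script
    # instead of computing anything from the encoder states.
    script = ["Knowledge", "is", "power", "</s>"]
    return script[len(produced)]


def decode(
    states: List[Vector],
    next_word: Callable[[List[Vector], List[str]], str],
    max_len: int = 10,
) -> List[str]:
    """Generate the target sentence one word at a time until the end marker."""
    produced: List[str] = []
    while len(produced) < max_len:
        word = next_word(states, produced)
        if word == "</s>":
            break
        produced.append(word)
    return produced


states = encode(["知识", "就是", "力量"])
translation = decode(states, toy_next_word)
print(states)
print(" ".join(translation))
```

The point of the sketch is the shape of the computation: the encoder yields one running summary vector per source word, and the decoder loop only starts once the whole sentence has been read, emitting words until it produces an end-of-sentence marker.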

In 2014, Google also bought Quest Visual, maker of the camera-based translation app Word Lens, to augment its capabilities for translating images of printed and handwritten words.

Real-Time Text-to-Text Translation using NMT

Facebook Translate

Facebook has been experimenting with machine translation for close to a decade. It launched the Translations app back in 2008, which used basic machine learning methods and crowd-sourced translations from its users to study and implement machine translation for its entire website. The Translations app has now evolved into a more sophisticated and intelligent Facebook Translate app that employs NMT.

In 2011, Facebook first introduced a new inline translation tool for its comments section, powered by Bing. The original announcement from Facebook about launching this tool using Bing seems to have been removed by Facebook. However, there are several third-party media sites, including Mashable, which picked up on this news back in 2011, citing the same. You can find additional third-party sources here, here, and here.

From these sources, we gather that Facebook worked with Bing for over six years to expand its translation capabilities to comments on its news feed in 44 languages, until it launched its own AI-powered translation system in May 2016.

The video below features Alan Packer, Facebook’s director of engineering for language technology, who revealed this progress at MIT Technology Review’s EmTech Digital conference in 2016.

An early August news release from Facebook this year declared that it is transitioning entirely from phrase-based models to NMT in order to provide more accurate translation results for its two billion users. The article also claims that NMT provides an 11% increase in translation accuracy across all languages when compared with the phrase-based systems.

Inaccurate translation from Turkish to English in real time using phrase-based models:

Facebook Translate before NMT

More accurate translation from Turkish to English in real time using NMT:

Facebook Translate with NMT

The article also mentions that Facebook’s backend translation systems, now running on NMT, account for more than 2,000 translation directions and 4.5 billion translations each day.
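A translation “direction” is an ordered pair of languages, so the count grows roughly with the square of the number of languages. The short Python sketch below works through that arithmetic. It assumes that every language can be translated directly into every other one, which Facebook’s article does not state, and the function names are ours.

```python
def translation_directions(num_languages: int) -> int:
    """Count ordered (source, target) pairs of distinct languages."""
    return num_languages * (num_languages - 1)


def min_languages_for(directions: int) -> int:
    """Smallest number of languages whose ordered pairs reach `directions`."""
    n = 2
    while translation_directions(n) < directions:
        n += 1
    return n


print(translation_directions(45))  # 1980
print(translation_directions(46))  # 2070
print(min_languages_for(2000))     # 46
```

Under that assumption, roughly 46 fully connected languages would be enough to exceed 2,000 directions.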

Real-Time Speech-to-Speech Translation

Skype Translate – Powered by Microsoft Cloud Translator Services

In December 2014, Skype officially introduced a voice translation feature in its software that initially used a combination of speech recognition and statistical machine translation to instantly translate conversational speech in real time between English and Spanish. The company claims that the training data for speech recognition and machine translation comes from a variety of sources, such as translated web pages, videos with captions, and previously translated and transcribed one-on-one conversations, including data donated by organizations.

As of 2017, Skype uses Microsoft Translator technology, which runs on neural networks, to more accurately translate audio conversations in 18 languages and textual conversations in 60+ languages.

The speech-to-speech translation is achieved in four steps. First, using automatic speech recognition, the system converts the source audio to raw text in the language in which it is spoken. Then, it normalizes the text for conversational speech to make it more suitable for translation. Next, the system translates the normalized source-language text into the intended target language using a text translation engine built on translation models specially developed for real-life spoken conversations. Finally, the system performs text-to-speech synthesis to produce the translated audio.

Microsoft - How does speech translation work?

Microsoft’s visualization of machine translation in action – source
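The four steps described above can be sketched as a simple chain of functions. The Python below is a minimal illustration, not Microsoft’s API: every function is a hypothetical stub, audio is represented as a plain string, and the one-entry phrasebook stands in for a real translation model.

```python
def recognize_speech(audio: str) -> str:
    # Step 1 (stub): pretend the "audio" string is already a raw transcript.
    return audio


def normalize(raw_text: str) -> str:
    # Step 2: drop spoken fillers so the text is easier to translate.
    fillers = {"um", "uh", "er"}
    words = [w for w in raw_text.split() if w.lower().strip(",.") not in fillers]
    return " ".join(words)


def translate(text: str) -> str:
    # Step 3 (stub): a one-entry phrasebook stands in for an MT engine.
    phrasebook = {"where is the station": "dónde está la estación"}
    return phrasebook.get(text.lower(), text)


def synthesize(text: str) -> str:
    # Step 4 (stub): tag the text to represent synthesized audio.
    return f"<audio:{text}>"


def speech_to_speech(audio: str) -> str:
    # Run the four stages in order, feeding each output to the next stage.
    result = audio
    for stage in (recognize_speech, normalize, translate, synthesize):
        result = stage(result)
    return result


output = speech_to_speech("um where is uh the station")
print(output)
```

Because each stage only consumes the previous stage’s output, an error made early on, such as a misrecognized word, carries through to the translated audio.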

Microsoft has launched several independent projects to showcase the potential of Skype Translate across the globe. The video below demonstrates one such project where an aurally challenged American child who uses hearing aids and cochlear implants explores Skype Translate to converse with children speaking in Mandarin.

Although these independent project videos seemingly show a successful demonstration of Skype’s speech-to-speech translation in real-time, it still remains to be seen whether Skype Translate could become scalable enough to use widely in online classrooms and business conversations between organizations.

The biggest drawback is that, in order to improve the accuracy of speech-to-speech translation and to capture the unique nuances of languages, Skype automatically records voice calls (after issuing an appropriate warning to the user) while the translate feature is in use. This poses a serious security risk to organizations.

The ili – “Wearable” Voice Translator Device

The ili is a small, handheld voice translator, built to translate simple, common phrases for travelers. Although it is marketed as a wearable, this Tokyo-based technology uses only a simple chain for a traveler to wear around their neck with the device as the pendant. It works offline and does not require the Internet to instantly translate speech.

It is currently available in only four languages (English, Spanish, Mandarin, and Japanese) and does not support two-way conversation. The ili’s website defends its preference for one-way translation by pointing out that travelers and tourists typically ask simple questions that warrant yes-or-no answers. The ili has a built-in phrase-based translation engine, which can be updated by plugging the device into a USB port and connecting to the website.

The ili initially launched as a limited edition of only 3,000 units, and according to its website, all 3,000 units have been sold. Despite the device’s several apparent shortcomings, the company has promised to sell more units in the near future.

The ili’s YouTube videos, such as the one below, demonstrate its successful use. However, neither the website nor the company’s news sources provide discernible evidence of the device’s level of translation accuracy.

Pilot – Wearable Earbud for Speech-to-Speech Translation

Pilot is yet another wearable device, manufactured by Waverly Labs. It is a pair of wireless earbuds that facilitates a two-way conversation between speakers of two different languages. It is the product of a well-funded Indiegogo campaign that raised more than $4.5M from about 22,000 backers.

The earbuds are paired with the Pilot app, which recognizes the speech, translates the source language to the target language using machine translation, and synthesizes the translated response. The website and the campaign page do not go into the details of the technology. They also make no false promises when it comes to translation accuracy. Instead, they claim that the accuracy gets better over time–as with most machine translation software.

Below is a short live demo by the CEO and cofounder of Waverly Labs, Andrew Ochoa. The host in the video tries a live conversation with Mr. Ochoa using the Pilot earpieces and the app.

It is evident from the video that the product is far from perfect. However, considering the fact that speech-to-speech translation wearables are a new technology and the claim by the company that the accuracy level would only increase with time, it remains to be seen whether this product will live up to its promise.

Pilot currently supports four Romance languages and English. The company claims that about 25,000 units have been sold to date, and it is taking preorders for more.

iTranslate Voice 2 – Speech-to-Speech Translation App

iTranslate Voice 2 by iTranslate is a speech-to-speech translation app that supports 42 languages. It claims to use sophisticated voice recognition and machine translation software to recognize speech in the source language and translate it in real time to the target language.

This app is marketed as a traveler’s translation companion and has a new feature called Phrasebook, which allows quick access to frequently used phrases. iTranslate has also released an app for the Apple Watch called iTranslate Converse.

There are many other speech-to-speech translation apps for travelers and tourists, such as SayHi, Speak and Translate, Papago, and TripLingo, which support varying numbers of languages and run on similar machine translation technology.

Real-Time Image-to-Text Translation

Textgrabber 6 – Real-Time OCR Capture and Translation

Textgrabber is an app from ABBYY that digitizes text in an image captured from any surface and translates it in near real time using optical character recognition (OCR) technology. It allows the user to take a snapshot and select the text from the image.

The app then isolates and digitizes the text and translates it in real-time to the intended target language. It was first launched in 2013 and seems to have evolved over time, with the addition of languages and new actions such as the ability to email translated text to a recipient, etc.

The company’s website and app download pages from App Store and iTunes claim that it supports over a hundred languages for full-text translation. However, the inner workings of the technology and the accuracy level of translation are not mentioned in any of the related pages.

Future Challenges of Machine Translation

A report on natural language processing (NLP) by Tractica, a Colorado market intelligence firm that focuses on human interaction with technology, forecasts that the market size of the NLP industry (of which machine translation is a part) will be about $2.1 billion by 2024.

A defining success factor of machine translation applications is their accuracy level. With the advent of NMT, machine translation applications seem to be improving their accuracy level over time. They “learn” user behavior, language patterns, and nuances that are both language-specific and domain-specific, from the data they are trained on, which is a continuous process.

However, they are still bound to make mistakes as machine translation is a fairly recent and still-evolving technology.

While mistranslations can be a source of hilarity, in some situations they can have life-or-death consequences. A recent Newsweek article reports on an unfortunate incident in which an unsuspecting man was arrested by Israeli police, all because Facebook mistranslated his “Good Morning” in Arabic to “Attack them” in English. It did not help that he was leaning against a bulldozer. He was allegedly released after the police questioned him for several hours.

According to the news source, Necip Fazil Ayan, an engineering manager in Facebook’s language technologies group, provided the following statement:

“Unfortunately, our translation systems made an error last week that misinterpreted what this individual posted. Even though our translations are getting better each day, mistakes like these might happen from time to time and we’ve taken steps to address this particular issue. We apologize to him and his family for the mistake and the disruption this caused.”

Incidents such as this one also raise serious questions in the business context: How much would a mistranslation-related incident cost a business? How safe is it to use a third-party machine translation technology, especially for businesses that need to translate confidential data? It remains to be seen how these challenges will be solved by technology in the near future.

 

Header image credit: Netherlands Foreign Investment Agency