How AI Is Challenging Traditional Translators
Let's take a look at how Artificial Intelligence is changing the work of traditional translators and explore how Machine Learning brings new opportunities for translation services.
In the last decade, translation services have grown exponentially to include hardware devices such as Travis Translator, earphones such as Waverly Labs' Pilot, Microsoft Translator (which translates not only text but also speech, images, and street signs), Google Translate, and Facebook translation. Thanks to machine translation, translations are now produced faster and with greater accuracy.
But what does this mean for the traditional translator? As an expatriate in Germany, I use both translation services and translation software, so I was interested to find out more. I spoke with Matt Romaine, the CEO and founder of Gengo. He co-founded Gengo in 2009 with the aim of democratizing access to opportunity for language enthusiasts around the world and becoming the bridge to mass global communication. Gengo offers a crowd-sourced human translation platform that now has over 20,000 translators supporting 35+ languages. Its clients include TripAdvisor, Etsy, Salesforce, eBay, Facebook, and Google.
Machine Learning Brings New Opportunities for Translation Services
Advances in Machine Learning have not removed the need for translators. Instead, they have created a new economy in response to the urgent need for rich, multilingual data to train AI systems. Earlier this year, Gengo launched Gengo.ai, an on-demand platform that gives developers of Machine Learning systems access to a wide array of multilingual services delivered by Gengo’s translators. The services available include sentiment analysis, content moderation, and other kinds of content evaluation, such as entity extraction, search engine training, and chatbot training.
As Matt explained, "We started noticing there were other ways people could use our multisystem crowdsourcing services. For example, a client requested to have our translators evaluate the enunciation and pronunciation of the recordings they were generating. And as we dug further, the clients were developing a sort of Machine Learning system either to respond to spoken commands, like Siri or Google Home, or to speak with a more understandable intonation.
AI is like a little child. The type of education that the kid gets highly influences the outcome. Training data is very important to improve the quality of AI and Machine Learning systems. We have this huge crowd of talented individuals and a whole new business opportunity akin to Amazon's Mechanical Turk, where you can get large crowds of people to do simple tasks. But there's actually a spectrum of needs for people with various levels of expertise, including a language background, where you need to understand the sentiment of a phrase in a different language. So that's how we're positioning ourselves: we can now take our crowd, who were originally trained for translation, and apply them to Machine Learning and data applications."
The Need for Deeper Analysis
Gengo.ai is working on a range of projects that illustrate the demand for structured, quality data that lets an AI system handle a wider range of tasks requiring cultural and ethnic expertise. Matt shared the example of recording voice data for a top car manufacturer to help AI get better at understanding immigrants, in preparation for the 2020 Olympics in Tokyo.
"Basically, there's a car navigation system manufacturer, and they have decided they want to build a system that can understand non-native Japanese speech. So what they needed was hours and hours of recordings in Japanese but spoken with a slight intonation. And so we were one of the only platforms that already had the demographic to do that. We were able to gather that data and sort of expand the Japanese language acknowledgments so they worked with non-native speakers." Gengo was able to create an audio dataset consisting of hundreds of voice recordings of non-native Japanese speakers.
The company is also involved in activities such as collecting eye-movement data from people of different ethnicities (drawn from the translator pool) to help autism research, and collecting samples of Japanese characters handwritten by native speakers to train an OCR engine to read handwritten documents.
How Is AI Advancing?
I asked Matt what he thought about the progress of AI and mentioned an amusing article I'd read about prototyping AI with human beings, which documented examples of humans doing work that was marketed as being delivered by AI. Matt noted:
"We're not where everyone's imagination thought we would be. The type of work that we're doing has graduated to be extremely specific. Instead of just identifying images or language, we're involved in determining intonation and sentiment analysis within specific platforms. And what's interesting is, if I tie it back to the translation space, people in the media often talk about how machine translation would put all of the translators out of a job. But we're actually more likely to see fewer individuals, but those employed doing more complex work."
I was curious to know what Matt thought about translation devices. He noted:
"I think they are fantastic. I mean, they're breaking down the language barriers, they're making communication more fluid and improving human interaction for the most part. Language is kind of like a living organism. There are always new words, and across different generations there's new terminology. Those devices take human input and are trained to be able to do the translation that they do. So there may be fewer people required to develop a device like that, but there will always need to be someone, maybe of a younger demographic, who can explain a new term or new word so the machine is able to translate it."
Since the launch of Gengo.ai, the company has processed more than one billion words. It offers a range of free resources to AI developers, such as a list of the 50 best free datasets for Machine Learning. AI developers can order AI training data by sending a preexisting file to Gengo’s personal account managers for review, or they can use the Gengo.ai API to access high-volume data.
Opinions expressed by DZone contributors are their own.