Google Translate will be upgraded using a "Neural Machine Translation" technique, starting with Chinese-English translation today:
Google has been working on a machine learning translation technique for years, and today is its official debut. The Google Neural Machine Translation [GNMT] system, deployed today for Chinese-English queries, is a step up in complexity from existing methods. Here's how things have evolved (in a nutshell). [...] GNMT is the latest and by far the most effective to successfully leverage machine learning in translation. It looks at the sentence as a whole, while keeping in mind, so to speak, the smaller pieces like words and phrases. It's much like the way we look at an image as a whole while being aware of individual pieces — and that's not a coincidence. Neural networks have been trained to identify images and objects in ways imitative of human perception, and there's more than a passing resemblance between finding the gestalt of an image and that of a sentence.
Interestingly, there's little in there actually specific to language: The system doesn't know the difference between the future perfect and future continuous, and it doesn't break up words based on their etymologies. It's all math and stats, no humanity. Reducing translation to a mechanical task is admirable, but in a way chilling — though admittedly, in this case, little but a mechanical translation is called for, and artifice and interpretation are superfluous.
The code runs on Google's homegrown TPUs. The Google Research Blog says that the technique will be applied to other language pairs in the coming months.
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Related Stories
In 2023, AI researchers at Meta interviewed 34 native Spanish and Mandarin speakers who lived in the US but didn't speak English. The goal was to find out what people who constantly rely on translation in their day-to-day activities expect from an AI translation tool. What those participants wanted was basically a Star Trek universal translator or the Babel Fish from the Hitchhiker's Guide to the Galaxy: an AI that could not only translate speech to speech in real time across multiple languages, but also preserve their voice, tone, mannerisms, and emotions. So, Meta assembled a team of over 50 people and got busy building it.
[...] AI translation systems today are mostly focused on text, because huge amounts of text are available in a wide range of languages thanks to digitization and the Internet.
[...] AI translators we have today support an impressive number of languages in text, but things are complicated when it comes to translating speech.
[...] A few systems that can translate speech-to-speech directly do exist, but in most cases they only translate into English and not in the opposite way.
[...] to pull off the Star Trek universal translator thing Meta's interviewees dreamt about, the Seamless team started with sorting out the data scarcity problem.
[...] Warren Weaver, a mathematician and pioneer of machine translation, argued in 1949 that there might be a yet undiscovered universal language working as a common base of human communication.
[...] Machines do not understand words as humans do. To make sense of them, they need to first turn them into sequences of numbers that represent their meaning.
[...] When you vectorize aligned text in two languages like those European Parliament proceedings, you end up with two separate vector spaces, and then you can run a neural net to learn how those two spaces map onto each other.
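As a rough illustration of that mapping step, here is a minimal sketch that learns a linear map between two toy vector spaces by gradient descent. It is a simplified stand-in for the neural net mentioned above, not Meta's method: the 2-D vectors, the function names, and the training settings are all invented for the example.

```python
# Toy "embeddings" of the same three concepts in two languages.
# These numbers are made up; the target space is the source space rotated 90 degrees.
source_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target_vectors = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]


def apply_map(weights, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def train_linear_map(sources, targets, steps=2000, learning_rate=0.1):
    """Fit weights so that apply_map(weights, source) approximates target.

    Uses plain stochastic gradient descent on the squared error,
    which amounts to a one-layer linear network with no bias term.
    """
    dim_out, dim_in = len(targets[0]), len(sources[0])
    weights = [[0.0] * dim_in for _ in range(dim_out)]
    for _ in range(steps):
        for src, tgt in zip(sources, targets):
            predicted = apply_map(weights, src)
            for i in range(dim_out):
                error = predicted[i] - tgt[i]
                for j in range(dim_in):
                    weights[i][j] -= learning_rate * error * src[j]
    return weights


learned_weights = train_linear_map(source_vectors, target_vectors)

# Map a vector that was not in the training pairs into the target space.
mapped = apply_map(learned_weights, [2.0, 0.0])
print(mapped)  # close to [0.0, 2.0], i.e. the same 90-degree rotation
```

Real systems work with hundreds of dimensions and millions of aligned sentences, and typically use a nonlinear network, but the idea of learning a function from one space to the other from paired examples is the same.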
But the Meta team didn't have those nicely aligned texts for all the languages they wanted to cover. So, they vectorized all texts in all languages as if they were just a single language and dumped them into one embedding space called SONAR (Sentence-level Multimodal and Language-Agnostic Representations).
[...] The team just used huge amounts of raw data—no fancy human labeling, no human-aligned translations. And then, the data mining magic happened.
Google has lifted the lid off of an internal project to create custom application-specific integrated circuits (ASICs) for machine learning tasks. The result is what they are calling a "TPU":
[We] started a stealthy project at Google several years ago to see what we could accomplish with our own custom accelerators for machine learning applications. The result is called a Tensor Processing Unit (TPU), a custom ASIC we built specifically for machine learning — and tailored for TensorFlow. We've been running TPUs inside our data centers for more than a year, and have found them to deliver an order of magnitude better-optimized performance per watt for machine learning. This is roughly equivalent to fast-forwarding technology about seven years into the future (three generations of Moore's Law). [...] TPU is an example of how fast we turn research into practice — from first tested silicon, the team had them up and running applications at speed in our data centers within 22 days.
The processors are already being used to improve search and Street View, and were used to power AlphaGo during its matches against Go champion Lee Sedol. More details can be found at Next Platform, Tom's Hardware, and AnandTech.
(Score: 0) by Anonymous Coward on Thursday September 29 2016, @10:24AM
China produces everything and America consumes everything and Google just took the jerbs of the people who been translating Chinese labels into English labels for American idiot consumers!
(Score: 1) by Francis on Thursday September 29 2016, @01:34PM
Mostly Chinese people and if you're lucky there's been a foreigner checking it afterwards, but usually not.
But only an idiot uses Google Translate for any sort of serious translation work. At best, it's a quick look. For best results, you have to type your input in the word order of the foreign language in order to nudge the translation into something approximating correct. Chinese is a particular problem and one where the current system fails worse than probably any other language, which is probably why it's the first to go. It's basically unusable now, but if the new one works, that would give them a ton of data to use when figuring out what to worry about with the other ones.
(Score: 0) by Anonymous Coward on Thursday September 29 2016, @11:19AM
I’m not sure I want to know that either…
(Score: 0) by Anonymous Coward on Thursday September 29 2016, @11:25AM
Apply now to be a White House intern and you will have two possible futures. You will give head in both futures, but will it be cock or clam?
(Score: 0) by Anonymous Coward on Thursday September 29 2016, @11:54AM
You forgot the Reptilian future.
(Score: 0) by Anonymous Coward on Thursday September 29 2016, @12:25PM
In this case the clam is a sub-set of Reptilian.
(Score: 0) by Anonymous Coward on Thursday September 29 2016, @02:00PM
How about machine translating this?
(Score: 1) by Francis on Thursday September 29 2016, @01:37PM
Unless you're an English teacher or learned English as a secondary language, you probably wouldn't know the difference explicitly, but you almost certainly know the difference implicitly.
Perfect just means that the action has been completed and continuous means that it's still in progress. I'm not sure why we use the terms perfect and continuous when we use the terms perfect and imperfect for the same basic concept when dealing with other languages.
(Score: 0) by Anonymous Coward on Thursday September 29 2016, @02:39PM
Okay, I follow the words, but I'm still struggling to understand. Could you please provide examples? From the use of the word 'action', I'm guessing this has something to do with verb tense. If so, it would help to see something like this:
Similarly, "will lift", "lift", and "lifted."
Do those encompass the concepts? If so, please identify the "perfect" and the "continuous" (and whatever the third one is) — and if not, then better examples would be much appreciated!
(Score: 2) by schad on Thursday September 29 2016, @05:22PM
For the past ten minutes, I have been kicking the ball.
(Score: 2) by HiThere on Thursday September 29 2016, @06:44PM
15 minutes from now I will have been kicking the ball.
I hope that tomorrow I will look back on kicking the ball with pleasure.
15 minutes from now I ought to have been kicking the ball.
etc.
Javascript is what you use to allow unknown third parties to run software you have no idea about on your computer.
(Score: 3, Funny) by wonkey_monkey on Thursday September 29 2016, @11:57AM
It make very good change word.
Thanks chief!
systemd is Roko's Basilisk
(Score: 2) by gringer on Thursday September 29 2016, @12:18PM
I really like this translation service. I am looking forward to many messages and inspiring spam in the future.
Ask me about Sequencing DNA [youtube.com]
(Score: 1) by Francis on Thursday September 29 2016, @01:44PM
I think there should be an equivalent of the Turing test for automatic translators. Basically, a system would pass when its translations beat those of a human translator.
That being said, I've been messing around in the new translator for the last couple minutes and the results seem to be much improved over the older version. Most of the things I'm typing in are correct, or at least in the ball park for what they should be. And I'm not having to do my customary Chinglish input to get something that's grammatically appropriate in Chinese.
I'm sure that as more people give corrections to the engine, it'll be able to handle more.
Admittedly, I'm just typing in relatively simple sentences that should have been right previously, but that's still a huge improvement. Chinese is notoriously difficult for machines to translate, especially simplified Chinese. A ton of characters that historically were distinct have been merged, so one character now does multiple things and the only way to know the difference is from context.
(Score: 2) by bob_super on Thursday September 29 2016, @05:32PM
Dear Sir,
I being only son of bullet behind head Chinese billionaire Shi YinPing, and need the assistance in transfer of the 253.12 MILLION JIAO AND 6 FEN.
You get the 15% for your invaluable help.
Please send details for the bank's the account so we may debute transfer.
Love
Shi Ske Bab.
(Score: 2) by KritonK on Thursday September 29 2016, @01:37PM
If I input
我隻氣墊船裝滿晒鱔
which I am assured [omniglot.com] means "my hovercraft is full of eels", I get:
I only hovercraft filled with eel.
It leaves a lot to be desired.
(Score: 3, Informative) by Francis on Thursday September 29 2016, @02:00PM
Looks like the original Chinese isn't right. The second character is wrong; it should be "我的氣墊船裝滿晒鱔". If you type that into the translator, you get almost exactly the translation you would expect, and arguably a completely correct one.
(Score: 0) by Anonymous Coward on Thursday September 29 2016, @02:56PM
That's a problem with machine translations. If one character is off a machine destroys the entire sentence. A human, like yourself, can tell it was a minor mistake and can still correctly translate the intent. So the machines still need improvement.
(Score: 2) by jdavidb on Thursday September 29 2016, @08:15PM
That's a problem with machine translations. If one character is off a machine destroys the entire sentence
On the plus side, that sounds like a very effective hashing algorithm!
ⓋⒶ☮✝🕊 Secession is the right of all sentient beings
(Score: 1) by Francis on Thursday September 29 2016, @09:13PM
Chinese is particularly problematic because they haven't discovered spaces between words. As a result, word segmentation issues abound.
Most other languages have them. It's rather inefficient to have to mentally insert white space when reading and makes it hard to identify words.
Making things worse, the language has a ton of particles that aren't words, but are required for grammatical reasons.
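To make the segmentation problem concrete, here is a minimal sketch of greedy forward maximum matching, one of the simplest ways to split unspaced text into words. It is only an illustration, not how Google's system works; the function name and the five-entry vocabulary are invented for the example.

```python
def segment_forward_max_match(text, vocabulary):
    """Split text greedily, always taking the longest vocabulary word
    that starts at the current position. Unknown single characters are
    emitted as one-character words so the loop always advances."""
    max_word_length = max(len(word) for word in vocabulary)
    words = []
    position = 0
    while position < len(text):
        longest = min(max_word_length, len(text) - position)
        for length in range(longest, 0, -1):
            candidate = text[position:position + length]
            if length == 1 or candidate in vocabulary:
                words.append(candidate)
                position += length
                break
    return words


# Hypothetical toy vocabulary:
# 研究 = research, 研究生 = graduate student, 生命 = life, 命 = fate, 起源 = origin
toy_vocabulary = {"研究", "研究生", "生命", "命", "起源"}

segments = segment_forward_max_match("研究生命起源", toy_vocabulary)
print(segments)  # ['研究生', '命', '起源']
```

The intended reading is 研究 / 生命 / 起源 ("research the origin of life"), but the greedy pass grabs 研究生 ("graduate student") first and leaves a stray 命 behind. Every character is valid and the split is still wrong, which is the kind of ambiguity that spaces would have resolved.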
(Score: 2) by darkfeline on Friday September 30 2016, @03:29AM
That's a false premise though. Not all human Chinese speakers would have caught that error, and many Chinese speakers are functionally illiterate.
Comparing the best of human performance against the worst of computer performance is hardly fair; at best, it is denial of the frightening potential of machine learning.
Join the SDF Public Access UNIX System today!
(Score: 2) by moondrake on Thursday September 29 2016, @02:14PM
Well, there is Chinese and there is Chinese. I am by no means fluent, but I would translate the sentence as Google did, even though I end up with nonsense (as if the original made much sense...). Then I followed your link and saw that you copied a Cantonese sentence... The writing often has roughly the same meaning (even though the spoken language is unintelligible to me, and probably to most native Mandarin speakers). But it is not the same.
Google does not seem to translate Cantonese. But it does translate the Mandarin Chinese correctly.
(Score: 2) by takyon on Thursday September 29 2016, @02:39PM
It would seem to count for both Simplified and Traditional, but it may not even work for the reverse English-Chinese pair yet. Another thing that could muddy the waters is that certain simple sentences seem to have been pre-checked by the "community":
https://translate.google.com/#en/zh-TW/This%20is%20a%20test. [google.com]
Click the green checkmark.
I'm surprised that Google isn't doing Cantonese. It looks like their strategy is to get others to do the hard work for them:
https://www.reddit.com/r/Cantonese/comments/2uwv54/google_has_listened_they_are_adding_cantonese_to/ [reddit.com]
https://productforums.google.com/forum/#!topic/translate/w3jPNgOQD8s [google.com]
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 0) by Anonymous Coward on Thursday September 29 2016, @05:14PM
How else would you explain that they can harness a Neural network but to assume that the Borg have finally arrived and they are Google?
Sincerely,
Somebody who's involved in actual neuroscience IT, not virtual computer "neural" crap.