

Meta takes us a step closer to Star Trek’s universal translator

Accepted submission by Freeman at 2025-01-15 22:26:35 from the darmok dept.
News

https://arstechnica.com/science/2025/01/meta-takes-us-a-step-closer-to-star-treks-universal-translator/ [arstechnica.com]

In 2023, AI researchers at Meta interviewed 34 native Spanish and Mandarin speakers who lived in the US but didn’t speak English. The goal was to find out what people who constantly rely on translation in their day-to-day activities expect from an AI translation tool. What those participants wanted was basically a Star Trek universal translator or the Babel Fish from the Hitchhiker’s Guide to the Galaxy: an AI that could not only translate speech to speech in real time across multiple languages, but also preserve their voice, tone, mannerisms, and emotions. So, Meta assembled a team of over 50 people and got busy building it.
[...]
AI translation systems today are mostly focused on text, because huge amounts of text are available in a wide range of languages thanks to digitization and the Internet.
[...]
AI translators we have today support an impressive number of languages in text, but things are complicated when it comes to translating speech.
[...]
A few systems that can translate speech directly to speech do exist, but in most cases they only translate into English, not the other direction.
[...]
To pull off the Star Trek universal translator Meta's interviewees dreamt about, the Seamless team started by sorting out the data-scarcity problem.
[...]
Warren Weaver, a mathematician and pioneer of machine translation, argued in 1949 that there might be a yet undiscovered universal language working as a common base of human communication.
[...]
Machines do not understand words as humans do. To make sense of them, they need to first turn them into sequences of numbers that represent their meaning.
[...]
When you vectorize aligned text in two languages like those European Parliament proceedings, you end up with two separate vector spaces, and then you can run a neural net to learn how those two spaces map onto each other.
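The "learn how those two spaces map onto each other" step can be sketched in a few lines. This is a toy illustration, not Meta's actual pipeline: given vectors for aligned sentences in two languages, fit a linear map from one space to the other by least squares, then translate a new vector by nearest neighbor. All the data here is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are vectorized, aligned sentences: 5 sentence pairs,
# each embedded as a 4-dimensional vector in its own language's space.
emb_a = rng.normal(size=(5, 4))          # sentence vectors, language A
true_map = rng.normal(size=(4, 4))
emb_b = emb_a @ true_map                 # the same sentences in language B's space

# Solve min_W ||emb_a @ W - emb_b||: learn the mapping between the spaces.
W, *_ = np.linalg.lstsq(emb_a, emb_b, rcond=None)

# Map a sentence vector from A's space into B's space...
query = emb_a[2]
mapped = query @ W

# ...and match it to its nearest neighbor among B's sentence vectors.
dists = np.linalg.norm(emb_b - mapped, axis=1)
print(dists.argmin())  # prints 2: the closest B sentence is the true pair
```

Real systems use far higher-dimensional embeddings and nonlinear models, but the principle is the same: aligned pairs supervise the mapping between spaces.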

But the Meta team didn’t have those nicely aligned texts for all the languages they wanted to cover. So, they vectorized all texts in all languages as if they were just a single language and dumped them into one embedding space called SONAR (Sentence-level Multimodal and Language-Agnostic Representations).
[...]
The team just used huge amounts of raw data—no fancy human labeling, no human-aligned translations. And then, the data mining magic happened.

SONAR embeddings represented entire sentences instead of single words. Part of the reason behind that was to control for differences between morphologically rich languages, where a single word may correspond to multiple words in morphologically simple languages. But the most important thing was that it ensured that sentences with similar meaning in multiple languages ended up close to each other in the vector space.
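The "data mining magic" follows directly from that property: if sentences with the same meaning land close together in one shared space, parallel pairs can be mined by similarity search. Here is a hypothetical sketch with hand-made toy vectors (the sentence labels in the comments are invented for illustration):

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Toy sentence embeddings in a shared, language-agnostic space.
lang_x = np.array([[0.9, 0.1, 0.0],   # x0: "the cat sleeps"
                   [0.0, 0.8, 0.2]])  # x1: "it is raining"
lang_y = np.array([[0.0, 0.9, 0.1],   # y0: translation of x1
                   [1.0, 0.2, 0.0]])  # y1: translation of x0

sim = cosine_sim(lang_x, lang_y)
# For each lang_x sentence, take the most similar lang_y sentence.
pairs = sim.argmax(axis=1)
print(pairs)  # [1 0]: x0 pairs with y1, x1 pairs with y0
```

At Meta's scale this nearest-neighbor search runs over billions of sentences, but the mined output is the same in kind: aligned sentence pairs recovered from raw, unlabeled text.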
[...]
The Seamless team suddenly got access to millions of aligned texts, even in low-resource languages, along with thousands of hours of transcribed audio. And they used all this data to train their next-gen translator.
[...]
The Nature paper published by Meta's Seamless team ends at the SEAMLESSM4T models, but Nature has a long editorial process to ensure scientific accuracy: the paper published on January 15, 2025, was submitted in late November 2023. A quick search of arXiv.org [arxiv.org], a repository of not-yet-peer-reviewed papers, turns up the details of two further models the Seamless team has already built on top of SEAMLESSM4T: SeamlessStreaming and SeamlessExpressive, which take this AI even closer to making a Star Trek universal translator a reality.

SeamlessStreaming is meant to solve the translation latency problem.
[...]
SeamlessStreaming was designed to take this experience a bit closer to what human simultaneous interpreters do—it translates what you're saying as you speak, in a streaming fashion. SeamlessExpressive, on the other hand, is aimed at preserving the way you express yourself in translations.
[...]
Sadly, it still can't do both at the same time; you have to choose either streaming or expressivity, at least for the moment. The expressivity variant also supports only a handful of languages: English, Spanish, French, and German. But at least it's online, so you can go ahead and give it a spin [metademolab.com].

Related stories on SoylentNews:
“AI Took My Job, Literally”—Gizmodo Fires Spanish Staff Amid Switch to AI Translator [soylentnews.org] - 20230906
Tokyo Tests Automated, Simultaneous Translation at Railway Station [soylentnews.org] - 20230805
AI Localization Tool Claims to Translate Your Words in Your Voice [soylentnews.org] - 20201017
The Shallowness of Google Translate [soylentnews.org] - 20180202
Survey Says AI Will Exceed Human Performance in Many Occupations Within Decades [soylentnews.org] - 20170701
Google Upgrades Chinese-English Translation with "Neural Machine Translation" [soylentnews.org] - 20160929
Android Marshmallow Has a Hidden Feature: Universal Translation [soylentnews.org] - 20151012


Original Submission