Meta says its new speech-generating AI tool is too dangerous to release:
Meta has unveiled a new AI tool, dubbed 'Voicebox', which it claims represents a breakthrough in AI-powered speech generation. However, the company won't be unleashing it on the public just yet - because doing so could be disastrous.
Voicebox is currently able to produce audio clips of speech in six languages (all of which are of European origin), and - according to a blog post from Meta - is the first AI model of its kind capable of completing tasks beyond what it was 'specifically trained to accomplish'. Meta claims that Voicebox handily outperforms competing speech-generation AIs in virtually every area.
So what exactly is it capable of? Well, for starters, it can spew out reasonably accurate text-to-speech replications of a person's voice using a sample audio file as short as two seconds, a seemingly innocuous ability that holds a huge amount of destructive potential in the wrong hands.
[...] Meta clearly believes its new tool is good enough to fool at least the majority of people [since] it's explicitly not releasing Voicebox to the public, but instead publishing a research paper and detailing a classifier tool that can distinguish Voicebox-generated speech from real human speech. Meta describes the classifier as "highly effective" - though notably not perfectly effective.
[...] A little caution, patience, and respect for the magnitude of this technology is a welcome sight - although I doubt Meta will sit on Voicebox for too long, since the shareholders will no doubt be wondering how much money it can make them...
Some of you may know Ken Starks as an advocate for those people (especially kids) who can't afford a computer of their own. In the process of placing used computers with those folks, Ken and his organization use Linux to cut costs and to avoid proprietary gotchas. As such, you may also know him as an advocate for FOSS.
I have mentioned previously that earlier this year, in his ongoing bout with cancer, Ken had his larynx removed.
As a kid, Ken had a frightening experience due to a guy using one of those buzzers pressed against his throat to "speak". Ken doesn't want any part of freaking out any kids in that way. He has been looking for an out-of-the-box text-to-speech app that runs under Linux.
What Ken has found is that, while there are several nice text-to-speech apps for Android, the state of text-to-speech for desktop Linux is very sad. The developers of FOSS TTS apps tend to reach a "suits my needs" state and leave it there.
Someone who has a bit of experience setting up software, is adept with a search engine, and has patience can install one of the existing FOSS TTS apps and get that working at a useful level. A nice out-of-the-box experience for Joe Average, however, has been lacking up to now. Ken has been trying to find developers who can bring MaryTTS to a state of usefulness and ease that will make it on par with the experience you would expect from a payware app.
Via his column at FOSS Force, Ken now reports:
Developers at business AI company Dessa have come up with a new text-to-speech system called "RealTalk". In the version they demoed, it was trained to speak with the voice of popular podcaster Joe Rogan. The developers have put up a site with a blind test at http://fakejoerogan.com/. They must have been so impressed by their own creation that they discuss the implications at https://medium.com/@dessa_/real-talk-speech-synthesis-5dd0897eef7f.
Your humble submitter did the blind test and just barely had a majority of correct guesses, but was so impressed by the quality that he considered it newsworthy. How do you fare in the test?
(Score: 0) by Anonymous Coward on Monday June 26, @08:07PM
Our new product is so damn good it'll fuck up your entire life and steal gramma's life savings!
I hope the full tool leaks and MetaBook goes bankrupt. The allegedly dangerous levels of voice synthesis will be open source sooner rather than later. Bring on the chaos.
(Score: 2) by istartedi on Monday June 26, @08:24PM
I think they're most likely concerned they'll get sued by actors, voice-over and otherwise. They'll take their time to line up a proper defense, evaluate the cost, and most likely release it with some kind of agreement that purports to absolve them of liability.
