
SoylentNews is people

posted Monday April 24 2017, @03:49PM
from the Is-it-live-or-is-it... dept.

According to a story appearing in The Economist, several companies are developing software that can, with only a relatively short sample of a person's speech, produce a "clone" of that voice saying nearly anything. CandyVoice is a phone app developed by a new Parisian company that needs only 160 or so French or English phrases, from which it can extract sufficient information to read plain text in that person's voice. Carnegie Mellon University has a similar program called Festvox. The Chinese internet giant Baidu claims it has software that needs only about 50 sentences. And now Vivotext, a voice-cloning firm headed by Gershon Silbert in Tel Aviv, looks to expand on that as it licenses its software to Hasbro.

More troubling, any voice—including that of a stranger—can be cloned if decent recordings are available on YouTube or elsewhere. Researchers at the University of Alabama, Birmingham, led by Nitesh Saxena, were able to use Festvox to clone voices based on only five minutes of speech retrieved online. When tested against voice-biometrics software like that used by many banks to block unauthorised access to accounts, more than 80% of the fake voices tricked the computer. Alan Black, one of Festvox's developers, reckons systems that rely on voice-ID software are now "deeply, fundamentally insecure".

And, lest people get smug about the inferiority of machines, humans have proved only a little harder to fool than software. Dr Saxena and his colleagues asked volunteers whether a voice sample belonged to a person whose real speech they had just listened to for about 90 seconds. The volunteers recognised cloned speech as such only half the time (ie, no better than chance). The upshot, according to George Papcun, an expert witness paid to detect faked recordings produced as evidence in court, is the emergence of a technology with "enormous potential value for disinformation". Dr Papcun, who previously worked as a speech-synthesis scientist at Los Alamos National Laboratory, a weapons establishment in New Mexico, ponders scenarios such as cloning an enemy leader's voice in wartime.

Efforts are now afoot to develop voice cloning detection software to identify whether a voice sample is genuine or machine-created.

[...] Nuance Communications, a maker of voice-activated software, is working on algorithms that detect tiny skips in frequency at the points where slices of speech are stuck together. Adobe, best known as the maker of Photoshop, an image-editing software suite, says that it may encode digital watermarks into speech fabricated by a voice-cloning feature called VoCo it is developing. Such wizardry may help computers flag up suspicious speech. Even so, it is easy to imagine the mayhem that might be created in a world which makes it easy to put authentic-sounding words into the mouths of adversaries—be they colleagues or heads of state.
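The frequency-skip idea can be illustrated with a toy sketch. Nothing here is Nuance's actual algorithm — the function names, frame length, and jump threshold are all illustrative assumptions. The sketch tracks the dominant frequency of successive audio frames via an FFT and flags frames where it jumps abruptly, the kind of discontinuity that can appear where slices of speech are stuck together:

```python
import numpy as np

def dominant_freqs(signal, sr, frame_len=1024):
    """Dominant frequency (Hz) of each non-overlapping frame, via FFT peak."""
    freqs = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        spectrum = np.abs(np.fft.rfft(frame * np.hanning(frame_len)))
        peak_bin = int(np.argmax(spectrum))
        freqs.append(peak_bin * sr / frame_len)
    return np.array(freqs)

def splice_points(signal, sr, frame_len=1024, jump_hz=100.0):
    """Frame indices where the dominant frequency skips by more than jump_hz."""
    f = dominant_freqs(signal, sr, frame_len)
    return np.where(np.abs(np.diff(f)) > jump_hz)[0] + 1

# Demo: two pure tones crudely concatenated, standing in for spliced speech.
sr = 16000
t = np.arange(sr // 2) / sr          # half a second per segment
a = np.sin(2 * np.pi * 220 * t)      # 220 Hz segment
b = np.sin(2 * np.pi * 440 * t)      # 440 Hz segment
spliced = np.concatenate([a, b])
print(splice_points(spliced, sr))    # frame index near the join is flagged
```

Real speech is far messier than two sine tones, which is why production detectors look at much subtler cues than a single spectral peak; this only shows the shape of the approach.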

Ever since I read "The Moon is a Harsh Mistress" by Robert Heinlein, I've been aware of the risks of assuming that the voice I hear is actually that of the person I think it is. That was reinforced when, as a teenager, I found that callers to our home phone could not distinguish my dad, my brother, and me by voice alone.

How soon will it be that audio, pictures, and videos will be able to be manipulated to such a degree that it becomes impossible to detect an original from a manipulated version? How will that affect evidence introduced in court? Where will things go from here?

Previously:
Adobe Voco 'Photoshop-for-Voice' Causes Concern


Original Submission
