
posted on Monday April 24 2017, @03:49PM
from the Is-it-live-or-is-it... dept.

According to a story appearing in The Economist, several companies are developing software that can, from only a relatively short sample of a person's speech, produce a "clone" of that voice saying nearly anything. CandyVoice, a phone app developed by a new Parisian company, needs only 160 or so French or English phrases from which to extract enough information to read plain text in that person's voice. Carnegie Mellon University has a similar program called Festvox. The Chinese internet giant Baidu claims it has software that needs only about 50 sentences. And now Vivotext, a voice-cloning firm headed by Gershon Silbert in Tel Aviv, looks to expand on that as it licenses its software to Hasbro.

More troubling, any voice—including that of a stranger—can be cloned if decent recordings are available on YouTube or elsewhere. Researchers at the University of Alabama, Birmingham, led by Nitesh Saxena, were able to use Festvox to clone voices based on only five minutes of speech retrieved online. When tested against voice-biometrics software like that used by many banks to block unauthorised access to accounts, more than 80% of the fake voices tricked the computer. Alan Black, one of Festvox's developers, reckons systems that rely on voice-ID software are now "deeply, fundamentally insecure".

And, lest people get smug about the inferiority of machines, humans have proved only a little harder to fool than software. Dr Saxena and his colleagues asked volunteers whether a voice sample belonged to a person whose real speech they had just listened to for about 90 seconds. The volunteers recognised cloned speech as such only half the time (ie, no better than chance). The upshot, according to George Papcun, an expert witness paid to detect faked recordings produced as evidence in court, is the emergence of a technology with "enormous potential value for disinformation". Dr Papcun, who previously worked as a speech-synthesis scientist at Los Alamos National Laboratory, a weapons establishment in New Mexico, ponders possibilities such as cloning an enemy leader's voice in wartime.

Efforts are now afoot to develop voice cloning detection software to identify whether a voice sample is genuine or machine-created.

[...] Nuance Communications, a maker of voice-activated software, is working on algorithms that detect tiny skips in frequency at the points where slices of speech are stuck together. Adobe, best known as the maker of Photoshop, an image-editing software suite, says that it may encode digital watermarks into speech fabricated by VoCo, a voice-cloning feature it is developing. Such wizardry may help computers flag up suspicious speech. Even so, it is easy to imagine the mayhem that might be created in a world which makes it easy to put authentic-sounding words into the mouths of adversaries—be they colleagues or heads of state.
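To make the frequency-skip idea concrete, here is a hypothetical Python sketch, not Nuance's actual algorithm: the function names, frame size, and jump threshold are all illustrative assumptions. It tracks the dominant frequency of successive short frames and flags abrupt jumps of the kind a crude splice might leave behind:

    # Hypothetical splice detector: flags abrupt jumps in the dominant
    # frequency between adjacent short frames of an audio signal.
    import numpy as np

    def dominant_freq(frame, sample_rate):
        """Return the peak frequency (Hz) of one Hann-windowed frame."""
        spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
        freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
        return freqs[np.argmax(spectrum)]

    def find_frequency_skips(signal, sample_rate, frame_ms=20, max_jump_hz=80):
        """Return indices of frames whose peak frequency jumps sharply."""
        frame_len = int(sample_rate * frame_ms / 1000)
        n_frames = len(signal) // frame_len
        peaks = [dominant_freq(signal[i * frame_len:(i + 1) * frame_len],
                               sample_rate) for i in range(n_frames)]
        return [i for i in range(1, n_frames)
                if abs(peaks[i] - peaks[i - 1]) > max_jump_hz]

A production detector would use richer spectral features and a trained classifier rather than a single threshold, but the principle is the same: concatenated slices of speech rarely line up perfectly in frequency.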

Ever since I read "The Moon is a Harsh Mistress" by Robert Heinlein, I've been aware of the risks of assuming that the voice I hear is actually that of the person I think it is. That was further confirmed when I was a teenager: callers to our home phone were unable to distinguish between my dad, my brother, and me by voice alone.

How soon will audio, pictures, and video be manipulable to such a degree that it becomes impossible to tell an original from a doctored version? How will that affect evidence introduced in court? Where will things go from here?

Previously:
Adobe Voco 'Photoshop-for-Voice' Causes Concern


Original Submission

 
  • (Score: 1, Interesting) by Anonymous Coward on Monday April 24 2017, @04:26PM (#498919)

    We already have a source of not-completely-reliable information: witness testimony. When recordings become as manipulable as witness testimony, similar measures will surely be applied to counteract the problem:

    • Question the origin of the material: who was in contact with the recording device, the transmission lines, and/or any storage media, and what ability and motive any such person might have to change the data.
    • Examine the material for inconsistencies, not only internal ones but especially inconsistencies with other material presented in the same case (for example, the shop camera shows the accused breaking into the shop, but the security camera on the next building, which happens to also cover the shop entrance, doesn't show it; at least one of the recordings must have been manipulated).

    But there will probably also be countermeasures of a technical nature. For example, a security camera might immediately sign all footage with a private key generated by, and stored only on, that camera; without access to the camera, forging validly signed footage would be impossible. That alone would massively reduce the number of people who could forge the material.
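    As a minimal sketch of that signing scheme, assuming Python and the widely used `cryptography` package (the key handling here is simplified for illustration; a real camera would keep the key in tamper-resistant hardware):

        # Illustrative sketch: the camera signs each chunk of footage with
        # an Ed25519 private key that never leaves the device; anyone who
        # holds the published public key can verify the footage later.
        from cryptography.exceptions import InvalidSignature
        from cryptography.hazmat.primitives.asymmetric.ed25519 import (
            Ed25519PrivateKey,
        )

        camera_key = Ed25519PrivateKey.generate()   # created once, on-device
        public_key = camera_key.public_key()        # published for verifiers

        def sign_chunk(footage: bytes) -> bytes:
            """Runs on the camera: sign a chunk of footage as recorded."""
            return camera_key.sign(footage)

        def verify_chunk(footage: bytes, signature: bytes) -> bool:
            """Runs anywhere: check a chunk against the camera's key."""
            try:
                public_key.verify(signature, footage)
                return True
            except InvalidSignature:
                return False

        chunk = b"raw video bytes"                  # placeholder footage
        sig = sign_chunk(chunk)
        assert verify_chunk(chunk, sig)             # authentic chunk passes
        assert not verify_chunk(chunk + b"!", sig)  # any tampering fails

    Of course, as the reply below points out, the scheme only holds as long as the private key cannot be extracted from the camera.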

  • (Score: 0) by Anonymous Coward on Monday April 24 2017, @05:40PM (#498956)

    Then along comes an IoT exploit for the camera, and it's back to Mousetrap 101 again.