posted by on Monday April 24 2017, @03:49PM   Printer-friendly
from the Is-it-live-or-is-it... dept.

According to a story appearing in The Economist, several companies are developing software that can, with only a relatively short sample of a person's speech, produce a "clone" of that person saying nearly anything. CandyVoice is a phone app developed by a new Parisian company that needs only 160 or so French or English phrases, from which it can extract sufficient information to read plain text in that person's voice. Carnegie Mellon University has a similar program called Festvox. The Chinese internet giant Baidu claims it has software which needs only about 50 sentences. And now Vivotext, a voice-cloning firm headed by Gershon Silbert in Tel Aviv, looks to expand on that as it licenses its software to Hasbro.

More troubling, any voice—including that of a stranger—can be cloned if decent recordings are available on YouTube or elsewhere. Researchers at the University of Alabama, Birmingham, led by Nitesh Saxena, were able to use Festvox to clone voices based on only five minutes of speech retrieved online. When tested against voice-biometrics software like that used by many banks to block unauthorised access to accounts, more than 80% of the fake voices tricked the computer. Alan Black, one of Festvox's developers, reckons systems that rely on voice-ID software are now "deeply, fundamentally insecure".

And, lest people get smug about the inferiority of machines, humans have proved only a little harder to fool than software is. Dr Saxena and his colleagues asked volunteers if a voice sample belonged to a person whose real speech they had just listened to for about 90 seconds. The volunteers recognised cloned speech as such only half the time (ie, no better than chance). The upshot, according to George Papcun, an expert witness paid to detect faked recordings produced as evidence in court, is the emergence of a technology with "enormous potential value for disinformation". Dr Papcun, who previously worked as a speech-synthesis scientist at Los Alamos National Laboratory, a weapons establishment in New Mexico, ponders on things like the ability to clone an enemy leader's voice in wartime.

Efforts are now afoot to develop voice cloning detection software to identify whether a voice sample is genuine or machine-created.

[...] Nuance Communications, a maker of voice-activated software, is working on algorithms that detect tiny skips in frequency at the points where slices of speech are stuck together. Adobe, best known as the maker of Photoshop, an image-editing software suite, says that it may encode digital watermarks into speech fabricated by a voice-cloning feature called VoCo it is developing. Such wizardry may help computers flag up suspicious speech. Even so, it is easy to imagine the mayhem that might be created in a world which makes it easy to put authentic-sounding words into the mouths of adversaries—be they colleagues or heads of state.
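Nuance's actual algorithms are not public, but the general idea of spotting "tiny skips in frequency" where slices of speech are stuck together can be illustrated with a toy sketch. Here a zero-crossing rate serves as a crude per-frame frequency proxy, and abrupt frame-to-frame jumps are flagged as possible splice points; the frame size and threshold are made-up values, not anything a real detector would use.

```python
import math

def zero_crossing_rate(frame):
    """Crude per-frame frequency proxy: fraction of sign changes."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / max(len(frame) - 1, 1)

def suspicious_joins(samples, frame_size=256, jump=0.2):
    """Return frame indices where the frequency proxy jumps abruptly,
    which could mark points where speech slices were stuck together."""
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples) - frame_size + 1, frame_size)]
    rates = [zero_crossing_rate(f) for f in frames]
    return [i for i in range(1, len(rates))
            if abs(rates[i] - rates[i - 1]) > jump]

# A 220 Hz tone spliced onto a 2200 Hz tone (at 8000 samples/s)
# produces one flagged join at the boundary frame.
low = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(2048)]
high = [math.sin(2 * math.pi * 2200 * t / 8000) for t in range(2048)]
print(suspicious_joins(low + high))
```

Real speech is far messier than two pure tones, of course, which is why production systems look at much subtler artefacts than this.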

Ever since I read "The Moon is a Harsh Mistress" by Robert Heinlein, I've been aware of the risks of assuming that the voice I hear is actually that of the person that I think it is. That was further confirmed when, as a teenager, callers to our home phone would be unable to distinguish between my Dad, my brother, and myself by voice alone.

How soon will it be that audio, pictures, and videos will be able to be manipulated to such a degree that it becomes impossible to detect an original from a manipulated version? How will that affect evidence introduced in court? Where will things go from here?

Previously:
Adobe Voco 'Photoshop-for-Voice' Causes Concern


Original Submission

Related Stories

Adobe Voco 'Photoshop-for-Voice' Causes Concern 20 comments

A new application that promises to be the "Photoshop of speech" is raising ethical and security concerns. Adobe unveiled Project Voco last week. The software makes it possible to take an audio recording and rapidly alter it to include words and phrases the original speaker never uttered, in what sounds like their voice.

One expert warned that the tech could further undermine trust in journalism. Another said it could pose a security threat. However, the US software firm says it is taking action to address such risks.

[...] "It seems that Adobe's programmers were swept along with the excitement of creating something as innovative as a voice manipulator, and ignored the ethical dilemmas brought up by its potential misuse," he told the BBC. "Inadvertently, in its quest to create software to manipulate digital media, Adobe has [already] drastically changed the way we engage with evidential material such as photographs.

"This makes it hard for lawyers, journalists, and other professionals who use digital media as evidence.

"In the same way that Adobe's Photoshop has faced legal backlash after the continued misuse of the application by advertisers, Voco, if released commercially, will follow its predecessor with similar consequences."

The risks extend beyond people being fooled into thinking others said something they did not. Banks and other businesses have started using voiceprint checks to verify customers are who they say they are when they phone in. One cybersecurity researcher said the companies involved had long anticipated something like Adobe's invention.


Original Submission

  • (Score: 1, Interesting) by Anonymous Coward on Monday April 24 2017, @04:10PM (2 children)

    by Anonymous Coward on Monday April 24 2017, @04:10PM (#498908)

    So what we're seeing is that for a brief moment, there were video cameras everywhere. People on the ground when something happened became instant journalists, recording events as they were, much to the chagrin of corporations and governments.

    Then, one day, ubiquitous video recording, which hadn't been available decades before, suddenly became untrustworthy. And we're back to where we started.

    • (Score: 0) by Anonymous Coward on Monday April 24 2017, @04:20PM

      by Anonymous Coward on Monday April 24 2017, @04:20PM (#498916)

The Butlers were on to something, I think. Technology is fun, but unless we're very responsible, it looks like we're gonna have some serious "fun" as we go through a round of adjustment. Said adjustment will of course involve nasty methods used to gain control of humanity, and then we'll have the inevitable revolution. Personally I would like to skip the horror part and have the world listen to the smart people saying these things are a bad idea. The premise of "us vs. them" is so very wrong; we need to manage our planet's resources as a species and get over this dumb nation state crap.

    • (Score: 0) by Anonymous Coward on Monday April 24 2017, @07:10PM

      by Anonymous Coward on Monday April 24 2017, @07:10PM (#499004)

      Then, one day, ubiquitous video recording, which hadn't been available decades before, suddenly became untrustworthy.

      Authenticating video is theoretically easy enough - have the camera do a digital signature of the original footage. No signature, no trust.
      Take it one step further and have the camera automatically put something like an HMAC of the video into a public blockchain in near real-time to reduce the window of time available for tampering. Sure, nothing is 100% trustworthy. But film never was either.
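The tag-and-publish idea described above can be sketched with Python's standard library. This is only a sketch: the hard-coded key, the footage bytes, and the "publish to a blockchain" step (here just returning the tag) are stand-ins, since a real camera would keep its key in secure hardware and push tags to an external timestamping service.

```python
import hashlib
import hmac

# Per-camera secret; a real device would keep this in secure hardware.
CAMERA_KEY = b"example-camera-key-not-for-production"

def tag_footage(video_bytes: bytes) -> str:
    """Compute an HMAC tag for a chunk of footage; in the scheme
    above, this tag would be published in near real-time."""
    return hmac.new(CAMERA_KEY, video_bytes, hashlib.sha256).hexdigest()

def verify_footage(video_bytes: bytes, tag: str) -> bool:
    """Check footage against a previously published tag."""
    expected = tag_footage(video_bytes)
    return hmac.compare_digest(expected, tag)

footage = b"\x00\x01frame-data"
tag = tag_footage(footage)
print(verify_footage(footage, tag))                  # untampered
print(verify_footage(footage + b"edit", tag))        # tampered
```

An HMAC needs the verifier to hold the same secret; a deployed system would more likely use an asymmetric signature (e.g. Ed25519) so anyone can verify without being able to forge.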

  • (Score: 1, Interesting) by Anonymous Coward on Monday April 24 2017, @04:26PM (1 child)

    by Anonymous Coward on Monday April 24 2017, @04:26PM (#498919)

We already have a source of not completely reliable information: witness testimony. When recordings become as manipulable as witness testimony, similar measures will surely be applied to counteract that:

    • Question the origin of the material: who was in contact with the recording device, transmission lines, and/or any storage media, and what ability and motives any such person might have to change the data.
    • Examine the material for inconsistencies, not only internal, but especially inconsistencies with other material also presented in the same case (for example, the shop camera shows the accused breaking into the shop, but the security camera from the next building, which happens to also cover the shop entry, doesn't show it; obviously at least one of the recordings has been manipulated).

But there will probably also be countermeasures of a technical nature. For example, the security camera might immediately sign any footage with a private key generated by and stored only on that camera; without access to the camera, forging footage would not be possible. That alone would massively reduce the number of people who could forge the material.

    • (Score: 0) by Anonymous Coward on Monday April 24 2017, @05:40PM

      by Anonymous Coward on Monday April 24 2017, @05:40PM (#498956)

      Then along comes an IoT exploit for the camera, and it's back to Mousetrap 101 again.

  • (Score: 1, Offtopic) by VLM on Monday April 24 2017, @04:29PM (2 children)

    by VLM (445) on Monday April 24 2017, @04:29PM (#498923)

Apparently it's improved a bit.

Around the turn of the century, before I got married back in the late 90s, when single-speed CD burners were no longer new but were still pretty new-ish, I downloaded some Gutenberg texts using my modem SLIP connection to an ISP (I had that legacy account from before POP became popular, or perhaps before POP existed...). Back then festival ran at much slower than real time, but I had my & sign and plenty of kWh, so I converted some old story (don't remember exactly, maybe Heart of Darkness by Conrad?) into wav files, then converted the wav files into audio tracks and burned a legacy old-fashioned CD audio disc.

    Technically it worked, but was unlistenable. It was, uh, not like listening to Alexa or Siri. Or even as good as MoonMan (NSFW).

    I am amazed at Alexa's speaking ability. She can read me wikipedia articles quite well, along with weather reports.

    • (Score: 0) by Anonymous Coward on Monday April 24 2017, @04:41PM

      by Anonymous Coward on Monday April 24 2017, @04:41PM (#498935)

      FTFY

    • (Score: 5, Funny) by Anonymous Coward on Monday April 24 2017, @07:02PM

      by Anonymous Coward on Monday April 24 2017, @07:02PM (#499000)

      Today I learned:
      VLM wants us to think he's married.
      That he used to own a modem.
      That he thinks serial line ip has something to do with the post office protocol.
      That he lets amazon spy on him with alexa.

      I did not learn what any of that has to do with forging voice prints. But that wasn't really the point. Nope, just a lonely guy in a basement somewhere crying out to the world, "I exist! Pay attention to me!"

  • (Score: 1, Funny) by Anonymous Coward on Monday April 24 2017, @04:45PM (2 children)

    by Anonymous Coward on Monday April 24 2017, @04:45PM (#498939)

    HiMyNmIsWrnrBrnds

    *** Entry denied ***
    Please speak more slowly

    Hi. My name is Werner Brandes. My voice is my passport. Verify me.

    *** Thank you ***

    • (Score: 0) by Anonymous Coward on Monday April 24 2017, @05:43PM (1 child)

      by Anonymous Coward on Monday April 24 2017, @05:43PM (#498957)
      • (Score: 0) by Anonymous Coward on Monday April 24 2017, @06:17PM

        by Anonymous Coward on Monday April 24 2017, @06:17PM (#498973)

        You got it

  • (Score: 1, Insightful) by Anonymous Coward on Monday April 24 2017, @07:43PM (1 child)

    by Anonymous Coward on Monday April 24 2017, @07:43PM (#499020)

They are all bullshit because these are not unique. Even DNA forensics is crap - nature didn't evolve DNA to be a cryptographic hash.

    • (Score: 2) by Immerman on Tuesday April 25 2017, @01:42PM

      by Immerman (3985) on Tuesday April 25 2017, @01:42PM (#499287)

You don't need a cryptographic hash - DNA is varied enough that it's pretty much guaranteed nobody except identical twins has the same thing. And even they may have different mutations.

      The problem with using it for forensics is that nobody spends the money to exhaustively sequence it for comparison, so instead of getting a conclusive does/doesn't match, you get the equivalent of "these two blurry photos look like they might be of the same person"

  • (Score: 1) by DmT on Monday April 24 2017, @08:16PM

    by DmT (6439) on Monday April 24 2017, @08:16PM (#499032)

Does anybody have some good software that can read articles to me?
I would be especially interested in something so realistic that half the time it would be like listening to a real person.

    Currently, what I have is quite bad and really difficult to listen to, because the TTS is so monotone and boring.

  • (Score: 1, Insightful) by Anonymous Coward on Monday April 24 2017, @08:20PM (1 child)

    by Anonymous Coward on Monday April 24 2017, @08:20PM (#499036)

Qu'on me donne six lignes dites de la bouche du plus honnête homme, j'y trouverai de quoi le faire pendre. ("Give me six lines spoken from the mouth of the most honest man, and I will find something in them to have him hanged.")

  • (Score: 0) by Anonymous Coward on Monday April 24 2017, @08:39PM

    by Anonymous Coward on Monday April 24 2017, @08:39PM (#499042)

    trump told me to bomb saudi arabia.

  • (Score: 2) by ledow on Monday April 24 2017, @09:39PM

    by ledow (5567) on Monday April 24 2017, @09:39PM (#499057) Homepage

    What idiot is using their voice pattern as their password?

    Sod all the rubbish in between, someone just has to record their voice to get complete access to whatever it was protecting, at any point in the future, and they CAN'T CHANGE IT.

    Stupid idea, let's hope software like this kills it overnight and we start using something actually SECURE.

    (Hint: Fingerprints, etc. all the same. If you can't change it, it's not security).

  • (Score: 0) by Anonymous Coward on Tuesday April 25 2017, @04:27AM

    by Anonymous Coward on Tuesday April 25 2017, @04:27AM (#499143)

    I recall hearing about IBM doing something like this (probably with looser specs, more input data) a decade ago. I guess society doesn't debate things until they have been mass produced at low cost. Too bad.

  • (Score: 2) by butthurt on Tuesday April 25 2017, @04:58AM

    by butthurt (6141) on Tuesday April 25 2017, @04:58AM (#499147) Journal
  • (Score: 0) by Anonymous Coward on Tuesday April 25 2017, @08:14AM (1 child)

    by Anonymous Coward on Tuesday April 25 2017, @08:14AM (#499186)

    Download our software package. Set up a sex-talk line. Watch the dollars flow.

    Now, you too can be making $2.99 to $4.99 per minute as callers talk to your server!

    Yessir. There are people out there with money to spend, and here's your chance to get your share of it!

    Just call (548kIksbe*4KmsY^n NO CARRIER

    • (Score: 0) by Anonymous Coward on Wednesday April 26 2017, @02:40AM

      by Anonymous Coward on Wednesday April 26 2017, @02:40AM (#499794)

The sad thing to watch will be how long these types of startups actually exist and make money, despite how easily people could find cheaper ways to achieve the same result (including running the software at home). After all, porn shops with DVDs and whatnot still exist despite the widespread abundance of easily accessible internet porn.

      Humans are a depressing species to watch sometimes.
