SoylentNews is people

posted by martyb on Tuesday May 28 2019, @03:38AM
from the cheese! dept.

Currently, producing a realistic deepfake requires shots of the subject from multiple angles. Russian researchers have now taken this a step further, generating realistic video sequences from a single photo.

Researchers trained the algorithm to understand facial features' general shapes and how they behave relative to each other, and then to apply that information to still images. The result was a realistic video sequence of new facial expressions from a single frame.

As a demonstration, they provide details and synthesized video sequences of historical figures such as Albert Einstein and Salvador Dali, as well as sequences based on paintings such as the Mona Lisa.

The authors are aware of the potential downsides of their technology and address them:

We realize that our technology can have a negative use for the so-called "deepfake" videos. However, it is important to realize, that Hollywood has been making fake videos (aka "special effects") for a century, and deep networks with similar capabilities have been available for the past several years (see links in the paper). Our work (and quite a few parallel works) will lead to the democratization of the certain special effects technologies. And the democratization of the technologies has always had negative effects. Democratizing sound editing tools lead to the rise of pranksters and fake audios, democratizing video recording lead to the appearance of footage taken without consent. In each of the past cases, the net effect of democratization on the World has been positive, and mechanisms for stemming the negative effects have been developed. We believe that the case of neural avatar technology will be no different. Our belief is supported by the ongoing development of tools for fake video detection and face spoof detection alongside with the ongoing shift for privacy and data security in major IT companies.

While it can learn from as little as one frame, the technology gains accuracy and 'identity preservation' when multiple frames are available. This becomes obvious in the synthesized Mona Lisa sequences, which, while faithful to the original, look to a human viewer like three different individuals.

Journal Reference: https://arxiv.org/abs/1905.08233v1

Related Coverage
Most Deepfake Videos Have One Glaring Flaw: A Lack of Blinking
My Struggle With Deepfakes
Discord Takes Down "Deepfakes" Channel, Citing Policy Against "Revenge Porn"
AI-Generated Fake Celebrity Porn Craze "Blowing Up" on Reddit
As Fake Videos Become More Realistic, Seeing Shouldn't Always be Believing



 
  • (Score: 2) by takyon on Tuesday May 28 2019, @04:37AM (1 child)

    by takyon (881) <reversethis-{gro ... s} {ta} {noykat}> on Tuesday May 28 2019, @04:37AM (#848414) Journal

    I haven't even watched the video yet, but the stills are great. Some will whine about fake news and elections, but this is a really powerful tool for pro or amateur artists.

    There's some previous work somewhere on Two Minute Papers [youtube.com] that is similar, but this seems to be more polished.

    --
    [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
  • (Score: 2) by HiThere on Tuesday May 28 2019, @04:14PM

    by HiThere (866) Subscriber Badge on Tuesday May 28 2019, @04:14PM (#848540) Journal

    Welllll.....:
    "Some will whine about fake news and elections, but this is a really powerful tool for pro or amateur artists."
    is true, but I hope what you meant was more like:
    "Some will whine about fake news and elections, but it's also true that this is a really powerful tool for pro or amateur artists."

    Text communications are more ambiguous than vocal, because the vocal stresses let the meaning be clearer. Both effects are happening, and neither is unimportant.

    --
    Javascript is what you use to allow unknown third parties to run software you have no idea about on your computer.