Deepfake videos are getting shockingly good [techcrunch.com]:
Researchers from TikTok owner ByteDance have demoed a new AI system, OmniHuman-1 [github.io], that can generate perhaps the most realistic deepfake videos to date.
Deepfaking AI is a commodity. There’s no shortage of apps that can insert someone into a photo, or make a person appear to say something they didn’t actually say. But most deepfakes — and video deepfakes in particular — fail to clear the uncanny valley. There’s usually some tell or obvious sign that AI was involved somewhere.
Not so with OmniHuman-1 — at least from the cherry-picked samples the ByteDance team released.
Here’s a fictional Taylor Swift performance:
Here’s a TED Talk that never took place:
And here’s a deepfaked Einstein lecture:
According to the ByteDance researchers, OmniHuman-1 needs only a single reference image and an audio track, such as speech or vocals, to generate a clip of arbitrary length. The output video’s aspect ratio is adjustable, as is the subject’s “body proportion” — i.e. how much of their body is shown in the fake footage.
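To make those described inputs concrete, here is a purely hypothetical sketch. ByteDance has not released the model or any public API, so every name below (`GenerationRequest`, `generate_clip`, the parameter values) is an assumption standing in for the controls the researchers describe: one reference image, a driving audio track, an adjustable aspect ratio, and a “body proportion” setting.

```python
# Hypothetical sketch only: OmniHuman-1 has no public API, so these names are
# illustrative stand-ins for the inputs described in the paper's summary.
from dataclasses import dataclass


@dataclass
class GenerationRequest:
    reference_image: str                 # path to the single reference photo
    driving_audio: str                   # speech or vocals that drive the motion
    aspect_ratio: str = "9:16"           # output framing, e.g. "16:9" or "9:16"
    body_proportion: str = "half-body"   # e.g. "portrait", "half-body", "full-body"
    duration_seconds: float | None = None  # None: run for the full length of the audio


def generate_clip(request: GenerationRequest) -> str:
    """Placeholder for the actual model call; returns a path to the rendered video."""
    print(
        f"Rendering {request.aspect_ratio} {request.body_proportion} clip "
        f"from {request.reference_image} driven by {request.driving_audio}"
    )
    return "output.mp4"


if __name__ == "__main__":
    generate_clip(GenerationRequest("speaker.jpg", "lecture.wav", aspect_ratio="16:9"))
```

Again, this is only a sketch of the input surface the researchers describe, not the actual interface to the model.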
Trained on 19,000 hours of video content from undisclosed sources, OmniHuman-1 can also edit existing videos — even modifying the movements of a person’s limbs. It’s truly astonishing how convincing the result can be.
Granted, OmniHuman-1 isn’t perfect. The ByteDance team says that “low-quality” reference images won’t yield the best videos, and the system seems to struggle with certain poses. Note the weird gestures with the wine glass in this video: