18 April 2024 -- Microsoft Research has announced VASA-1, an AI model that generates talking-face videos realistic enough to pass for real footage.
It takes a single portrait photo and a speech audio clip and produces a hyper-realistic talking-face video with precise lip-audio sync, lifelike facial behavior, and naturalistic head movements, all generated in real time.
Of course, the examples now being released are likely to be cherry-picked, but this is still amazing.
No, the tech is not “perfect” yet, says an AI friend I spoke with who’s trained to spot the tells. But it’s getting there.
He said it’s hard to describe in words, but these types of AI-driven videos always show a subtle, nearly invisible “net” that seems to pull across the face whenever a non-neutral expression is made. It’s as if everything is being pulled together, so the expression feels uniformly stretched.
Obviously the fraud implications are simply ... 😱 🤯. But some of the other use cases that have been suggested are:
- reviving dead actors for new movie roles
- bringing back dead relatives to "chat" with you
- synthetic video calls and chatbots
To see all the examples that Microsoft has released, Eduardo Borges has posted them in a thread on Twitter, which you can watch by clicking here.
The full Microsoft news release and paper can be accessed by clicking here.
And you can read the detailed tech paper on arXiv by clicking here.