OpenAI World Changer: Why Sora’s Unveiling is Historic


The world will never be the same after OpenAI revealed its new video creation model – Sora. Sora makes realistic videos, up to a minute long, from a short text prompt, and the results are absolutely stunning. Check out the segment above to see what I mean. Sora can generate complex scenes with many characters and different types of motion, plus fine details of the subjects and background. Sora not only understands the prompt, it also understands how things exist in reality.


Prompt: The camera follows behind a white vintage SUV with a black roof rack as it speeds up a steep dirt road surrounded by pine trees on a steep mountain slope, dust kicks up from it’s tires, the sunlight shines on the SUV as it speeds along the dirt road, casting a warm glow over the scene. The dirt road curves gently into the distance, with no other cars or vehicles in sight. The trees on either side of the road are redwoods, with patches of greenery scattered throughout. The car is seen from the rear following the curve with ease, making it seem as if it is on a rugged drive through the rugged terrain. The dirt road itself is surrounded by steep hills and mountains, with a clear blue sky above with wispy clouds.

Why is it historic?

See for yourself in the batch of clips released by OpenAI. Just watch and you’ll realise their significance. The videos are incredible and mark the start of a new branch of AI where moving pictures will eventually become photo-real. In fact, although not perfect, this initial batch of Sora videos is good enough to fool many people. I know this because driving and flying simulators often trick punters on social media. And Sora is far better.

Base compute followed by 4x compute followed by 32x compute

This type of artificial intelligence requires a lot of computing power to render a realistic video. Sora is a diffusion model. It generates a video by starting with one that looks like static noise, then gradually transforms it by removing that noise over many steps.
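To make the "remove noise over many steps" idea concrete, here is a toy sketch in Python. It is not how Sora works internally: a real diffusion model uses a trained neural network to predict the noise, while this example cheats by computing the noise from a known target. The function name `denoise`, the four-number "frame" and the step count are all made up for illustration.

```python
import random


def denoise(target, steps=50, seed=0):
    """Toy reverse-diffusion loop: start from static and strip noise step by step."""
    rng = random.Random(seed)
    # Start from a "frame" of pure static: random values unrelated to the target.
    frame = [rng.gauss(0.0, 1.0) for _ in target]
    for step in range(steps):
        # ASSUMPTION: a real model predicts the noise with a neural network.
        # Here we cheat and read it off the known target instead.
        predicted_noise = [x - t for x, t in zip(frame, target)]
        # Remove an equal slice of the original noise each step;
        # the final step removes whatever is left.
        fraction = 1.0 / (steps - step)
        frame = [x - fraction * n for x, n in zip(frame, predicted_noise)]
    return frame


if __name__ == "__main__":
    target = [0.0, 0.5, 1.0, 0.5]  # stand-in for the pixels of a clean frame
    print(denoise(target))
```

Each pass through the loop leaves the frame a little less noisy than before, which is why more steps (and more compute) tend to produce cleaner results.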

It’s also very complicated, as it should be given its capabilities. If you want to dig deeper, OpenAI has a fascinating research page on its website.

Why Reveal it Now?

Prompt: Historical footage of California during the gold rush

OpenAI has chosen to reveal Sora now to test the waters. It’s part excitement, part dread. OpenAI wants feedback from outsiders while giving the public an idea of what Sora can do.

It’s not available to the public yet because extensive safety checks need to be done first. Sora will be available to red teamers (testers) to help assess potential harms and risks. Visual artists, designers, and filmmakers will also be asked to give feedback on how to improve Sora for creative professionals.

Sora AI generated moving eye

On the OpenAI website it says, “The text classifier will check and reject text input prompts that are in violation of usage policies, like those that request extreme violence, sexual content, hateful imagery, celebrity likeness, or the IP of others.”

Given how realistic Sora videos are already, the technology will no doubt be abused by some people. OpenAI is building tools to help detect misleading content, such as a detection classifier that can tell when a video was generated by Sora.

The puppies’ fur isn’t quite right, but it takes time to notice

OpenAI has also developed robust image classifiers that review the frames of every generated video to help ensure it sticks to usage policies before it’s shown to the user.

So yes, it’s exciting and scary at the same time. Because even if OpenAI does the right thing, others may not. This fundamentally changes what we can believe on screen, and while not perfect, it’s already very good.

Is it real? Is it AI?

To combat this uncertainty, OpenAI also plans to incorporate C2PA metadata to ensure a clear chain of transformation.

So mark February 2024 down as the month when the world changed forever.
