OpenAI Teases a New Generative Video Model Called Sora

posted by hubie on Friday February 23 2024, @11:12AM

OpenAI teases an amazing new generative video model called Sora [technologyreview.com]:

OpenAI has built a striking new generative video model called Sora that can take a short text description and turn it into a detailed, high-definition film clip up to a minute long.

Based on four sample videos that OpenAI shared with MIT Technology Review ahead of today's announcement, the San Francisco–based firm has pushed the envelope of what's possible with text-to-video generation (a hot new research direction that we flagged as a trend to watch in 2024).

"We think building models that can understand video, and understand all these very complex interactions of our world, is an important step for all future AI systems," says Tim Brooks, a scientist at OpenAI.

[...] Impressive as they are, the sample videos shown here were no doubt cherry-picked to show Sora at its best. Without more information, it is hard to know how representative they are of the model's typical output.

It may be some time before we find out. OpenAI's announcement of Sora today is a tech tease, and the company says it has no current plans to release it to the public. Instead, OpenAI will today begin sharing the model with third-party safety testers for the first time.

In particular, the firm is worried about the potential misuses [technologyreview.com] of fake but photorealistic video [technologyreview.com]. "We're being careful about deployment here and making sure we have all our bases covered before we put this in the hands of the general public," says Aditya Ramesh, a scientist at OpenAI, who created the firm's text-to-image model DALL-E [technologyreview.com].

But OpenAI is eyeing a product launch sometime in the future. As well as safety testers, the company is also sharing the model with a select group of video makers and artists to get feedback on how to make Sora as useful as possible to creative professionals. "The other goal is to show everyone what is on the horizon, to give a preview of what these models will be capable of," says Ramesh.

[...] OpenAI is well aware of the risks that come with a generative video model. We are already seeing the large-scale misuse of deepfake images [technologyreview.com]. Photorealistic video takes this to another level.

[...] The OpenAI team plans to draw on the safety testing it did last year for DALL-E 3. Sora already includes a filter that runs on all prompts sent to the model that will block requests for violent, sexual, or hateful images, as well as images of known people. Another filter will look at frames of generated videos and block material that violates OpenAI's safety policies.
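
OpenAI has not published how these filters are implemented, but the two-stage idea described here (screen the prompt before generation, then screen the generated frames) is easy to sketch. The Python snippet below is a minimal illustration, not OpenAI's actual pipeline: the prompt check uses OpenAI's public moderation endpoint, while frame_is_safe and generate_fn are hypothetical placeholders.

```python
# Sketch of a two-stage safety filter like the one described above:
# (1) reject disallowed prompts before generation, (2) screen generated frames.
# Only the text-moderation call uses a real, public API; the frame classifier
# is a hypothetical stand-in for an image-safety model.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def prompt_is_allowed(prompt: str) -> bool:
    """Return False if the prompt is flagged (violence, sexual content, hate, etc.)."""
    result = client.moderations.create(input=prompt).results[0]
    return not result.flagged


def frame_is_safe(frame_bytes: bytes) -> bool:
    """Hypothetical placeholder for a frame-level image-safety classifier."""
    raise NotImplementedError("plug in an image-safety model here")


def generate_video_safely(prompt: str, generate_fn) -> list[bytes] | None:
    if not prompt_is_allowed(prompt):
        return None                      # block the request up front
    frames = generate_fn(prompt)         # whatever video generator is in use
    if not all(frame_is_safe(f) for f in frames):
        return None                      # block output that violates policy
    return frames
```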

OpenAI says it is also adapting a fake-image detector developed for DALL-E 3 to use with Sora. And the company will embed industry-standard C2PA tags, metadata that states how an image was generated, into all of Sora's output. But these steps are far from foolproof. Fake-image detectors are hit-or-miss. Metadata is easy to remove, and most social media sites strip it from uploaded images by default.
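
How easily that metadata disappears is simple to demonstrate. The sketch below, assuming Pillow is installed and using a still image as a stand-in for video output, shows that a plain re-encode drops EXIF and other embedded metadata, which is essentially what most social platforms do on upload; embedded C2PA manifests are lost the same way unless a pipeline deliberately preserves them. The file names are hypothetical.

```python
# Demonstration that embedded metadata does not survive a plain re-encode.
# Uses a still image as a stand-in; the same caveat applies to provenance
# metadata (EXIF, XMP, C2PA manifests) in any media file.
from PIL import Image


def strip_by_reencoding(src_path: str, dst_path: str) -> None:
    """Re-save the image without passing its metadata through."""
    img = Image.open(src_path)
    img.save(dst_path)   # no exif argument, so embedded metadata is dropped


def has_exif(path: str) -> bool:
    return bool(Image.open(path).getexif())


if __name__ == "__main__":
    strip_by_reencoding("tagged.jpg", "stripped.jpg")        # hypothetical files
    print("original has EXIF:   ", has_exif("tagged.jpg"))
    print("re-encoded has EXIF: ", has_exif("stripped.jpg"))  # expected: False
```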

OpenAI Unveils A.I. That Instantly Generates Eye-Popping Videos [nytimes.com]:

In April, a New York start-up called Runway AI unveiled technology that let people generate videos, like a cow at a birthday party or a dog chatting on a smartphone, simply by typing a sentence into a box on a computer screen.

The four-second videos were blurry, choppy, distorted and disturbing. But they were a clear sign that artificial intelligence technologies would generate increasingly convincing videos in the months and years to come.

Just 10 months later, the San Francisco start-up OpenAI has unveiled a similar system that creates videos that look as if they were lifted from a Hollywood movie. A demonstration included short videos — created in minutes — of woolly mammoths trotting through a snowy meadow, a monster gazing at a melting candle and a Tokyo street scene seemingly shot by a camera swooping across the city.

OpenAI, the company behind the ChatGPT chatbot and the still-image generator DALL-E, is among the many companies racing to improve this kind of instant video generator, including start-ups like Runway and tech giants like Google and Meta, the owner of Facebook and Instagram. The technology could speed the work of seasoned moviemakers, while replacing less experienced digital artists entirely.

[Video] This video's A.I. prompt: "Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes." Credit: OpenAI

It could also become a quick and inexpensive way of creating online disinformation, making it even harder to tell what's real on the internet.

OpenAI introduces Sora, its text-to-video AI model [theverge.com]:

OpenAI is launching a new video-generation model, and it's called Sora. The AI company says Sora "can create realistic and imaginative scenes from text instructions." The text-to-video model allows users to create photorealistic videos up to a minute long — all based on prompts they've written.

Sora is capable of creating "complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background," according to OpenAI's introductory blog post. The company also notes that the model can understand how objects "exist in the physical world," as well as "accurately interpret props and generate compelling characters that express vibrant emotions."

The model can also generate a video based on a still image, as well as fill in missing frames on an existing video or extend it. The Sora-generated demos included in OpenAI's blog post include an aerial scene of California during the gold rush, a video that looks as if it were shot from the inside of a Tokyo train, and others. Many have some telltale signs of AI — like a suspiciously moving floor in a video of a museum — and OpenAI says the model "may struggle with accurately simulating the physics of a complex scene," but the results are overall pretty impressive.
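
There is no public Sora API, so the interface below is purely hypothetical: the class, method names, and parameters are invented here only to make the three modes just described (text-to-video, image-to-video, and extending or in-filling an existing clip) concrete.

```python
# Purely hypothetical interface; OpenAI has not published a Sora API.
# Everything here is invented for illustration of the capabilities in the article.
from dataclasses import dataclass


@dataclass
class VideoClip:
    frames: list[bytes]        # encoded frames
    frame_rate: float = 24.0


class HypotheticalSoraClient:
    def text_to_video(self, prompt: str, seconds: int = 60) -> VideoClip:
        """Generate a clip up to a minute long from a text prompt."""
        ...

    def image_to_video(self, image: bytes, prompt: str | None = None) -> VideoClip:
        """Animate a still image, optionally guided by a prompt."""
        ...

    def extend_video(self, clip: VideoClip, extra_seconds: int) -> VideoClip:
        """Extend an existing clip forward or backward in time."""
        ...

    def fill_missing_frames(self, clip: VideoClip,
                            gaps: list[tuple[int, int]]) -> VideoClip:
        """Fill in missing frame ranges (start, end indices) of an existing clip."""
        ...
```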

[...] Earlier this month, OpenAI announced it's adding watermarks to its text-to-image tool DALL-E 3, but notes that they can "easily be removed." Like its other AI products, OpenAI will have to contend with the consequences of fake, photorealistic AI videos being mistaken for the real thing.


Original Submission

Related Stories

Tyler Perry Puts $800 Million Studio Expansion on Hold Because of OpenAI's Sora 16 comments

https://arstechnica.com/information-technology/2024/02/i-just-dont-see-how-we-survive-tyler-perry-issues-hollywood-warning-over-ai-video-tech/

In an interview with The Hollywood Reporter published Thursday, filmmaker Tyler Perry spoke about his concerns related to the impact of AI video synthesis on entertainment industry jobs. In particular, he revealed that he has suspended a planned $800 million expansion of his production studio after seeing what OpenAI's recently announced AI video generator Sora can do.

"I have been watching AI very closely," Perry said in the interview. "I was in the middle of, and have been planning for the last four years... an $800 million expansion at the studio, which would've increased the backlot a tremendous size—we were adding 12 more soundstages. All of that is currently and indefinitely on hold because of Sora and what I'm seeing. I had gotten word over the last year or so that this was coming, but I had no idea until I saw recently the demonstrations of what it's able to do. It's shocking to me."

[...] "It makes me worry so much about all of the people in the business," he told The Hollywood Reporter. "Because as I was looking at it, I immediately started thinking of everyone in the industry who would be affected by this, including actors and grip and electric and transportation and sound and editors, and looking at this, I'm thinking this will touch every corner of our industry."

You can read the full interview at The Hollywood Reporter.

[...] Perry also looks beyond Hollywood and says that it's not just filmmaking that needs to be on alert, and he calls for government action to help retain human employment in the age of AI. "If you look at it across the world, how it's changing so quickly, I'm hoping that there's a whole government approach to help everyone be able to sustain."

Previously on SoylentNews:
OpenAI Teases a New Generative Video Model Called Sora - 20240222

Toys “R” Us Riles Critics With “First-Ever” AI-Generated Commercial Using Sora 11 comments

https://arstechnica.com/information-technology/2024/06/toys-r-us-riles-critics-with-first-ever-ai-generated-commercial-using-sora/

On Monday, Toys "R" Us announced that it had partnered with an ad agency called Native Foreign to create what it calls "the first-ever brand film using OpenAI's new text-to-video tool, Sora." OpenAI debuted Sora in February, but the video synthesis tool has not yet become available to the public. The brand film tells the story of Toys "R" Us founder Charles Lazarus using AI-generated video clips.

"We are thrilled to partner with Native Foreign to push the boundaries of Sora, a groundbreaking new technology from OpenAI that's gaining global attention," wrote Toys "R" Us on its website. "Sora can create up to one-minute-long videos featuring realistic scenes and multiple characters, all generated from text instruction. Imagine the excitement of creating a young Charles Lazarus, the founder of Toys "R" Us, and envisioning his dreams for our iconic brand and beloved mascot Geoffrey the Giraffe in the early 1930s."

Previously on SoylentNews:
Tyler Perry Puts $800 Million Studio Expansion on Hold Because of OpenAI's Sora - 20240225
OpenAI Teases a New Generative Video Model Called Sora - 20240222
Toys 'R' Us Files for Bankruptcy Protection in US - 20170919 (Toys 'R' Us is a "zombie brand" now; the Canadian entity has always been separate and still exists.)


Original Submission

This discussion was created by hubie (1068) for logged-in users only, but now has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 3, Touché) by PiMuNu on Friday February 23 2024, @12:17PM (1 child)

    by PiMuNu (3823) on Friday February 23 2024, @12:17PM (#1345834)

    > can understand video, and understand all these very complex interactions of our world

    Deliberate mis-selling.

    "Can cut and paste video in a sufficiently obfuscated way" is the correct statement

    • (Score: 5, Insightful) by choose another one on Friday February 23 2024, @02:01PM

      by choose another one (515) Subscriber Badge on Friday February 23 2024, @02:01PM (#1345845)

      Maybe, but it's getting much better at it awfully rapidly.

End of the day, almost all video/film is edited and cut-and-pasted together (it used to be literally) to get the desired effect; that skill is now being automated.

I've already seen an analysis video of this which the YouTube algorithm threw into my list. I won't say it was scary, but it was sufficiently interesting and engaging that I actually carried on watching it, which is unusual for random algo recs. https://www.youtube.com/watch?v=NXpdyAWLDas [youtube.com]

I'm not a singularity fanboy or scaremonger, but I am now reasonably convinced that we are heading somewhere quite different to today, and heading there very fast.

  • (Score: 3, Informative) by looorg on Friday February 23 2024, @02:33PM

    by looorg (578) on Friday February 23 2024, @02:33PM (#1345848)

So is it on the level with Google's new Gemini? 'Cause that is apparently having a difficult relationship with history and reality when it comes to rendering things.

  • (Score: 2) by Rosco P. Coltrane on Friday February 23 2024, @04:31PM (3 children)

    by Rosco P. Coltrane (4757) on Friday February 23 2024, @04:31PM (#1345874)

    and you'll have a pretty good idea of what my face looks like when I read yet another fucking piece on fucking AI.

    • (Score: 2) by janrinok on Friday February 23 2024, @04:36PM (2 children)

      by janrinok (52) Subscriber Badge on Friday February 23 2024, @04:36PM (#1345878) Journal

      You might want to go outside for a walk on Saturday at least. We can only process the submissions that we get.

      --
      I am not interested in knowing who people are or where they live. My interest starts and stops at our servers.
      • (Score: 2) by Rosco P. Coltrane on Friday February 23 2024, @06:14PM (1 child)

        by Rosco P. Coltrane (4757) on Friday February 23 2024, @06:14PM (#1345914)

        It's not just here you know. You can't go to a public bathroom without hearing about AI lately.

        I vent here because why not.

        Also, I know you're just pushing submission to the front page. The criticism wasn't directed at SN: more to submitters, to submit a bit more diverse news for a change. There are interesting things other than AI...

        • (Score: 2) by janrinok on Friday February 23 2024, @06:32PM

          by janrinok (52) Subscriber Badge on Friday February 23 2024, @06:32PM (#1345917) Journal

          No offence taken, I can assure you! Editors have to have much thicker skin than that.

However, I will second your plea for more variety in submissions, please.

          --
          I am not interested in knowing who people are or where they live. My interest starts and stops at our servers.
  • (Score: 4, Informative) by DadaDoofy on Friday February 23 2024, @07:01PM (1 child)

    by DadaDoofy (23827) on Friday February 23 2024, @07:01PM (#1345925)

    Pro Tip: Be sure to do the tests that will save OpenAI the embarrassment of releasing an openly racist product, like Google Gemini.

    https://www.cnn.com/2024/02/22/tech/google-gemini-ai-image-generator/index.html [cnn.com]
