Have you ever wanted to know what two golden retrievers podcasting on top of a mountain might look like? Or perhaps watch a bicycle race on the ocean with different animals riding the bicycles?
Now you can. OpenAI’s latest generative artificial intelligence offering, Sora, can generate breathtakingly realistic videos that are up to a minute long from text prompts. OpenAI CEO Sam Altman announced the model’s creation on X on Thursday.
Sora is not yet available to the public. For now, OpenAI is only granting access to red teamers—individuals employed to look for issues—who will assess potential risks associated with the model’s release, as well as a limited number of “visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals,” according to a blog post.
The post displays a number of videos generated by Sora. To further show off Sora’s capabilities, Altman invited users on X to suggest prompts from which it would generate videos. They responded by requesting, among other things, a video of an “instructional cooking session for homemade gnocchi hosted by a grandmother social media influencer set in a rustic Tuscan country kitchen with cinematic lighting,” and another in which a “half duck half dragon flies through a beautiful sunset with a hamster dressed in adventure gear on its back.” Many X users expressed their excitement about the new technology.
OpenAI is the company behind the groundbreaking chatbot ChatGPT. It also produced the popular image-generation AI model Dall-E. Until the Sora announcement, the leading text-to-video AI model was developed by Brooklyn-based Runway. Runway’s most advanced model, Gen-2, was announced in March 2023. The videos it produced were choppy, short, and often nightmarish. Some users marveled at the superiority of Sora’s videos, noting the pace of AI progress in less than one year. Runway CEO and co-founder Cristóbal Valenzuela posted “game on” on X in response to OpenAI’s announcement.
Sora lands at a critical moment for these programs. Experts have expressed concern that AI-generated content could be used to wrongly influence elections or otherwise sow confusion worldwide. The World Economic Forum’s Global Risks Report 2024 listed AI-generated misinformation and disinformation as the most significant risk facing the world in 2024. In addition to working with red teamers to identify risks Sora could pose, OpenAI is building classifiers that could alert users if a video was generated by Sora. The company also plans to include C2PA metadata, imperceptible additions to files containing AI-generated content that would allow provenance verification, if it deploys Sora in a product, according to the blog post accompanying Sora’s release.
OpenAI follows a doctrine it refers to as “iterative deployment,” in which it releases AI models while they are still relatively primitive compared to what they might be in a few years’ time, in order to allow society to adjust to the new technology. The Sora blog post reaffirms this doctrine, stating that “we believe that learning from real-world use is a critical component of creating and releasing increasingly safe AI systems over time.”
The technical report released by OpenAI does not disclose the data that Sora was trained on. Such nondisclosure has become increasingly common in the AI industry amid a growing number of lawsuits accusing AI companies of training their models on other people’s data without consent. Visual artists have sued AI companies, and last year’s actors’ strike was in part motivated by fears that AI could replace actors and writers. OpenAI has drawn its share of controversy: late last year, the New York Times sued OpenAI for copyright infringement, and some AI researchers have criticized the company, claiming that it used, without proper credit, work that academic researchers had published openly.