Stable Video Diffusion is now available through Stability AI API


Stability AI, the company known for the Stable Diffusion text-to-image generator, has announced that its new foundation image-to-video model, Stable Video Diffusion (SVD), is now available on its developer platform and through its application programming interface (API), allowing third-party developers to incorporate it into their own apps, websites, software and services.

“This new addition provides programmatic access to the state-of-the-art video model designed for various sectors…Our aim with this release is to provide developers with an efficient way to seamlessly integrate advanced video generation into their products,” the company wrote in a blog post.

While the release can help enterprises looking to generate AI videos, it also raises some concerns, given that Stability AI is already drawing flak for training its models on LAION-5B, the open-source AI dataset that has been found to contain at least 1,008 instances of child sexual abuse material and was taken offline this week as a result.

Still, for individuals and enterprises looking to build generative video into their apps, Stability’s new SVD API plug-ins do provide one of the leading options in terms of quality, offering “2 seconds of video, comprising of 25 generated frames and 24 frames of FILM interpolation, within an average time of 41 seconds,” according to a post by Stability AI on its LinkedIn page. This may not be enough for major video campaigns, but it can surely come in handy for producing GIFs with specific messaging, including memes.


The offering takes on competitive video generation models from Runway and Pika Labs, the latter of which recently raised $55 million from Lightspeed Venture Partners and debuted a new web platform to generate and edit videos.

However, neither of these offerings has made its video-generating AI models available through an API — you need to go directly to their respective websites and apps to use them, meaning that, for now at least, external developers can't really build apps atop them or incorporate them into their own products.

Notably, Stability also plans to launch a user-facing web experience for its video generator, although there's no word on when it will be available. The company is inviting users to join a waitlist to be among the first to try out the interface.

First, let's understand what Stable Video Diffusion does

Announced nearly a month ago in research preview, Stable Video Diffusion allows users to generate MP4 videos by prompting with still images, including JPGs and PNGs.

Going by the samples shared by the company, the model does a decent job of producing the requested clips but remains at a nascent stage, generating only short videos lasting up to two seconds. This is even less than the four-second clips produced by research-centric video models.

But of course, multiple video clips could be chained together to form a larger video.
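The chaining idea above can be sketched in a few lines: feed the last frame of each generated clip back in as the conditioning image for the next one. The `generate_clip` callable below is a hypothetical stand-in for whatever actually calls the SVD API; it is an assumption for illustration, not part of Stability's SDK.

```python
def chain_clips(first_frame, generate_clip, num_clips=3):
    """Build a longer sequence from several short SVD clips.

    `generate_clip` is a hypothetical callable that takes a conditioning
    frame and returns a list of frames (one short clip). Each clip is
    seeded from the last frame of the previous one, so the clips join
    into a single continuous sequence.
    """
    frames = []
    conditioning = first_frame
    for _ in range(num_clips):
        clip = generate_clip(conditioning)
        frames.extend(clip)
        conditioning = clip[-1]  # last frame conditions the next clip
    return frames
```

Note that quality can drift with this approach, since each clip compounds any artifacts present in the previous clip's final frame.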

Stability, for its part, claims the model can help in sectors such as advertising, marketing, TV, film and gaming.

More interestingly, unlike the models released last month for probing and feedback, the one released recently can produce videos in multiple layouts and resolutions, including 1024×576, 768×768 and 576×1024. It also includes added capabilities like motion strength control and seed-based control, which allows developers to choose between repeatable or random generation.
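A request to an image-to-video endpoint with these controls might look something like the sketch below. The endpoint path, field names (`seed`, `cfg_scale`, `motion_bucket_id`), and response shape are assumptions for illustration — consult Stability's API reference for the exact contract.

```python
API_HOST = "https://api.stability.ai"  # assumed host for illustration


def build_svd_request(image_path, seed=0, motion_bucket_id=127, cfg_scale=1.8):
    """Assemble the multipart form for an image-to-video generation call.

    seed=0 is assumed to mean "random"; a fixed nonzero seed gives
    repeatable output. motion_bucket_id is assumed to control motion
    strength (higher = more motion).
    """
    files = {"image": open(image_path, "rb")}
    data = {
        "seed": seed,
        "cfg_scale": cfg_scale,
        "motion_bucket_id": motion_bucket_id,
    }
    return files, data


def start_generation(api_key, image_path, **params):
    """POST the conditioning image and return a generation id to poll."""
    import requests  # imported here; third-party dependency

    files, data = build_svd_request(image_path, **params)
    resp = requests.post(
        f"{API_HOST}/v2beta/image-to-video",  # assumed endpoint path
        headers={"Authorization": f"Bearer {api_key}"},
        files=files,
        data=data,
    )
    resp.raise_for_status()
    return resp.json()["id"]  # assumed: poll a results endpoint with this id
```

Because generation takes tens of seconds, the API is presumably asynchronous: the POST returns an id, and the finished MP4 is fetched from a separate results endpoint once it's ready.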

Stability continues to race despite controversy

While the launch of Stable Video Diffusion does give enterprises an easy way to build video generation capabilities into their products, it also highlights that Stability AI is ready to race toward capturing the market even as some question the source of its training data.

Just recently, a report from the Stanford Internet Observatory found that the free LAION-5B dataset, which has been used to train popular AI text-to-image generators, including Stable Diffusion 1.5 (released by Runway and supported by Stability), contains at least 1,008 instances of child sexual abuse material. The publisher, LAION, has now taken down the dataset.

Earlier this year, the company was named in a class-action lawsuit alleging that it paid LAION to acquire “copies of billions of copyrighted images without permission to create Stable Diffusion.”

Currently, Stability’s developer platform API provides access to all of the company’s models, from the Stable Diffusion XL text-to-image generator to the new SVD model. The company also offers a membership to help customers host the models locally.
