Date: 2023-11-22

Stability AI, an AI startup, has unveiled Stable Video Diffusion, an AI model that turns existing images into videos by animating them. It is one of the few video-generating models available as open source. The launch comes as the turmoil at OpenAI dominates the headlines, and it signals that Stability AI remains focused on its product roadmap.

Stable Video Diffusion is currently in a “research preview” stage, and anyone who wants to run it must first agree to specific terms of use. These terms outline the intended applications of Stable Video Diffusion, such as educational and creative tools, as well as design and other artistic processes. Factual or true representations of people or events are explicitly not among the intended uses.

Although the model holds considerable promise, there are concerns about misuse. Given the history of similar AI research previews, including Stability AI’s own earlier releases, the model could end up circulating on the dark web. To counteract this, it is crucial for Stable Video Diffusion to have a built-in content filter. Stability AI’s earlier image model, Stable Diffusion, was misused to create nonconsensual deepfake adult content, which underlines the importance of content safeguards.

Stable Video Diffusion comes in the form of two models, SVD and SVD-XT. SVD transforms still images into 576×1024 videos with 14 frames, while SVD-XT uses the same architecture but raises the frame count to 25. Both models can generate videos at frame rates between 3 and 30 frames per second.
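To put those numbers in perspective, the frame counts and frame rates imply very short clips. The following is a minimal Python sketch of that arithmetic, not code from Stability AI; the function name `clip_duration_seconds` and the constants are illustrative assumptions, and only the 14-frame figure for SVD and the 3 to 30 fps range come from the article.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Playback length of a clip with num_frames frames shown at fps frames per second."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


# Figures quoted in the article; the names are hypothetical.
SVD_FRAMES = 14           # frames per clip for SVD
MIN_FPS, MAX_FPS = 3, 30  # supported frame-rate range

longest = clip_duration_seconds(SVD_FRAMES, MIN_FPS)
shortest = clip_duration_seconds(SVD_FRAMES, MAX_FPS)
print(f"SVD: {shortest:.2f}s to {longest:.2f}s")  # prints "SVD: 0.47s to 4.67s"
```

In other words, a single SVD generation lasts under five seconds even at the slowest supported frame rate, and under half a second at the fastest.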

According to the whitepaper accompanying Stable Video Diffusion, SVD and SVD-XT were first trained on a dataset of millions of videos and then fine-tuned on a much smaller set of hundreds of thousands to around one million clips. It remains unclear where these videos came from and whether any of them were copyrighted. If copyrighted material was used without permission, Stability AI and the users of Stable Video Diffusion could face legal and ethical challenges.

The models have clear limitations: they tend to produce videos with little or no motion or only slow camera pans, they cannot render text legibly, and they do not consistently produce accurate depictions of faces and people. Even so, Stability AI is optimistic about the extensibility of the models. The company says they can be adapted for use cases such as generating 360-degree views of objects.

Stability AI has ambitious plans for Stable Video Diffusion. The company is planning a range of models that build on and extend SVD and SVD-XT, along with a web-based “text-to-video” tool that will let users prompt the models with text. The ultimate goal is commercialization, with potential applications in advertising, education, entertainment, and beyond.

