Machine Learning Art
Diffusion Models for Video Modeling
state-of-the-art results on video generation

Meet data scientists, and professionals at the cutting edge of deep learning, and what they’ve found is that can create high-quality videos with state-of-the-art video quality scores using just a few simple ingredients: a Gaussian diffusion model, a deep learning accelerator, and some video samples to train it on.
Generating high-fidelity, temporally coherent video is a significant milestone in generative modeling research. The authors make headway toward this goal by developing a video creation diffusion model with promising preliminary findings. According to the researchers, their model is a logical extension of the classic image diffusion architecture, allowing for combined training using the image and video data, which reduces the variance of minibatch gradients and speeds up optimization.
- April 2022 — AI art tools update can be found ➡️ HERE ⬅️
They describe a new conditional sampling strategy for spatial and temporal video extension that outperforms previous methods of making longer and higher-resolution movies. The authors report the first findings from a sizeable text-conditioned video creation assignment and the most up-to-date results from an established unconditional video generation benchmark.
Diffusion models have lately shown high-quality outcomes in picture and audio production, and there is a lot of interest in proving diffusion models in other data modalities. The authors offer preliminary results for video production utilizing diffusion models in both unconditional and conditional contexts in this paper. Previous work on video production has used other types of generative models, such as autoregressive models, VAEs, GANs, and normalizing flows.
Gaussian
They demonstrate that high-quality films may be produced using basically the traditional formulation of the Gaussian diffusion model, with just minor architectural alterations to handle video data within the memory limits of deep learning accelerators.
state-of-the-art
They train models that output a set amount of video frames and then use a new approach for the conditional generation to apply this model autoregressively to generate longer films. Finally, the authors put our approaches to the test on total video production, achieving state-of-the-art sample quality scores and text-conditioned video generation, encouraging early results.

Additional methods:
🔵 They employ a factorized space-time UNet for video data, which is a simple extension of the typical 2D UNet used in image diffusion models.
🔵 Their factorized UNets may be run on varied sequence lengths, allowing them to train on both video and image modeling objectives at the same time.
🔵 Similar to previous work on image modeling, classifier-free guidance enhances sample quality for text conditioned generation.
Conclusion
Diffusion models have lately shown high-quality outcomes in fields like as picture and audio production, and there is a lot of interest in testing them in new data modalities. The authors offer preliminary results on video production using diffusion models in both unconditional and conditional contexts in this paper. Other types of generative models, such as GANs, VAEs, flow-based models, and autoregressive models, have typically been used in previous work on video production.
@article{ho2022video,title={Video diffusion models},
author={Ho, Jonathan and Salimans, Tim and Gritsenko, Alexey and Chan, William and Norouzi, Mohammad and Fleet, David J},
journal={arXiv:2204.03458},
year={2022}}}

Project Page:
https://arxiv.org/pdf/2204.03458.pdf
Keywords: computer vision, diffusion model, state-of-the-art, video,
I invite you to explore the concept of “AI creativity” by reading and learning from the many articles found on 🔵 MLearning.ai 🟠
I am an Art Curator, founder at EvArtology. I advise companies and institutions in the creative industries on using AI tools in their daily work. Human collaboration with ML models can be very creative and bring huge benefits. The new era begins now.
Data Scientists must think like an artist when finding a solution when creating a piece of code. Artists enjoy working on interesting problems, even if there is no obvious answer.
- Check out my instagram with new material every week
- If you enjoyed this, follow me on Medium for more
- Want to collaborate? Let’s connect on LinkedIn
- https://linktr.ee/evartology
All our writers (members) receive the opportunity to be promoted on our social media, which increases the popularity of articles published on MLearning.ai





