ControlNet and Stable Diffusion: A Game Changer for AI Image Generation
New technology brings unprecedented levels of control to Stable Diffusion

ControlNet is revolutionary. With a new paper submitted last week, the boundaries of AI image and video creation have been pushed even further: it is now possible to use sketches, outlines, depth maps, or human poses to control diffusion models in ways that were not possible before. Here’s how this is changing the game and bringing us closer to unlimited control of AI imagery and fully customized design.
Finally: In Control!
The revolutionary thing about ControlNet is its solution to the problem of spatial consistency. Previously, there was simply no efficient way to tell an AI model which parts of an input image to keep. ControlNet changes this by introducing a method that lets Stable Diffusion models use additional input conditions that tell the model exactly what to do! Reddit user IWearSkin with an apt summary:

ControlNet Examples
To demonstrate ControlNet’s capabilities, a number of pre-trained models have been released that showcase control over image-to-image generation based on different conditions, e.g. edge detection, depth information analysis, sketch processing, or human pose.
For example, ControlNet’s Canny edge model uses an edge detection algorithm to derive a Canny edge image from a given input image (“Default”), and then uses that edge image as the control condition for further diffusion-based image generation:
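To give a rough feel for the edge-extraction step, here is a minimal sketch in plain Python. It is a deliberate simplification: it thresholds Sobel gradient magnitudes, whereas the real Canny algorithm also adds smoothing, non-maximum suppression, and hysteresis thresholding. The function name `sobel_edge_map`, the threshold value, and the toy image are all assumptions made for illustration and are not taken from ControlNet's code.

```python
import math

def sobel_edge_map(gray, threshold=100.0):
    """Return a binary edge map (1 = edge) for a 2D list of grayscale values.

    Simplified stand-in for Canny: Sobel gradient magnitude plus one threshold.
    Border pixels are left as 0 because the 3x3 kernel does not fit there.
    """
    h, w = len(gray), len(gray[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = gray[y + dy - 1][x + dx - 1]
                    gx += kx[dy][dx] * p
                    gy += ky[dy][dx] * p
            if math.hypot(gx, gy) >= threshold:
                edges[y][x] = 1
    return edges

# Toy 6x6 image (hypothetical data): dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
edge_map = sobel_edge_map(image)
for row in edge_map:
    print("".join("#" if v else "." for v in row))
```

On the toy image, the only marked pixels sit along the vertical boundary between the dark and bright halves. An edge map of this kind is what serves as the extra condition for generation.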

In the same way, ControlNet’s HED model showcases control over an input image via HED boundary detection:

And here’s ControlNet’s pose detection model:

ControlNet’s Scribble model casually enhances sketch-based diffusion as well:


ControlNet also works with Stable Diffusion’s default masked diffusion. For example, the Canny Edge model can be used to control image manipulation with manual editing:
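As a loose illustration of what a mask contributes to this kind of editing, the toy sketch below composites two images: pixels stay untouched where the mask is 0 and are replaced by newly generated content where it is 1. This is only an assumption-laden picture of the idea on plain lists. It is not how Stable Diffusion or ControlNet implement masking, which happens during the denoising process. The name `composite` and the sample values are hypothetical.

```python
def composite(original, generated, mask):
    """Keep original pixels where mask is 0, take generated pixels where mask is 1."""
    return [
        [g if m else o for o, g, m in zip(orow, grow, mrow)]
        for orow, grow, mrow in zip(original, generated, mask)
    ]

# Hypothetical 2x3 grayscale images and a mask marking the region to edit.
original = [[10, 10, 10], [10, 10, 10]]
generated = [[99, 99, 99], [99, 99, 99]]
mask = [[0, 1, 1], [0, 0, 1]]

result = composite(original, generated, mask)
print(result)
```

Only the masked region changes, and everything outside it is carried over from the original image.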

And these are just a few examples of the models presented in the original paper, which have already triggered the development of a new generation of toolkits for creators (interestingly, ControlNet casually got rid of “strange hands” already).
In addition, with spatial consistency solved, new advances in temporal consistency and AI cinema can be expected!





