Rajesh Mani Kumar G



Stable Cascade: A Text-to-Image Gem from Stable Diffusion (Local Setup Info Included)

Stable Diffusion took the AI art world by storm with its ability to generate high-quality images from text descriptions. Now, Stability AI presents Stable Cascade, the successor that pushes the boundaries of Text-to-Image generation even further. Let’s delve into what makes Stable Cascade a gem and how you can leverage its power on your local machine.

What is Stable Cascade?

Stable Cascade builds upon the foundation of Stable Diffusion. Unlike Stable Diffusion's single model, Stable Cascade comprises three distinct stages (A, B, and C), which allows hierarchical compression of images and produces remarkable outputs from a highly compressed latent space. It uses an architecture called “Würstchen” that achieves significant improvements in two key areas:

  • Efficiency: Stable Cascade boasts a much smaller latent space compared to its predecessor. This translates to faster inference times and lower computational costs, making it more suitable for resource-constrained environments.
  • Accuracy: Despite its efficiency, Stable Cascade excels in generating images that adhere closely to the provided text prompt. This is particularly evident in its ability to render text within the image itself, a feat that often challenged Stable Diffusion.
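To make the efficiency point concrete, here is a small sketch that compares latent grid sizes for a 1024x1024 image. The compression factors (roughly 42 per side for Stable Cascade and 8 for Stable Diffusion) come from Stability AI's model card rather than from this article, so treat them as assumptions. `latent_side` and `latent_positions` are hypothetical helper names.

```python
# Assumed spatial compression factors per side (from the Stable Cascade
# model card, not measured here).
SD_FACTOR = 8
CASCADE_FACTOR = 42


def latent_side(image_side: int, factor: int) -> int:
    """Side length of the latent grid for a square image (hypothetical helper)."""
    return image_side // factor


def latent_positions(image_side: int, factor: int) -> int:
    """Number of spatial positions the diffusion model has to process."""
    side = latent_side(image_side, factor)
    return side * side


if __name__ == "__main__":
    for name, factor in (("Stable Diffusion", SD_FACTOR), ("Stable Cascade", CASCADE_FACTOR)):
        side = latent_side(1024, factor)
        print(f"{name}: {side}x{side} latent, {latent_positions(1024, factor)} positions")
```

Under these assumptions the Stable Cascade latent is 24x24 instead of 128x128, which is where the faster inference and lower compute cost would come from.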

Local Setup: Unleash Your Creativity

The beauty of Stable Cascade lies in its open-source nature. You can set it up and experiment on your local machine! Here’s a basic example using the Diffusers library:

Create a new conda environment or Python virtual environment and activate it. I am using a laptop with an NVIDIA GeForce GTX 1050 Ti (4 GB) GPU on Ubuntu 22.04.4 LTS.

Note: Installation and image generation times depend on your system configuration and can be long on modest hardware.

conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install "diffusers[torch]" transformers
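Before running the samples, it may help to confirm that the packages are visible from the active environment. The sketch below uses only the standard library. `missing_packages` is a hypothetical helper, and `accelerate` is on the list because `enable_model_cpu_offload()` depends on it (it should come with the `diffusers[torch]` extra, but that is an assumption about your install).

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level packages from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Packages the sample code in this article relies on.
    required = ["torch", "diffusers", "transformers", "accelerate"]
    missing = missing_packages(required)
    print("All packages found." if not missing else f"Missing: {', '.join(missing)}")
```

This checks only that the packages can be located. It does not import them or verify that CUDA works.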

Sample Code 1:

from diffusers import StableCascadeCombinedPipeline

# Load the pre-trained model
pipe = StableCascadeCombinedPipeline.from_pretrained("stabilityai/stable-cascade")

# Craft your text prompt
prompt = "A majestic griffin soaring through a vibrant sunset sky."

# Generate the image!
image = pipe(prompt=prompt)

# Save your image
image.images[0].save("griffin_sunset.png")

Sample Code 2:

If you want to control more parameters, such as image size and the number of inference steps, try this version. The model files (Safetensors) and sample code can be downloaded from the Hugging Face URL below.

Safetensors is a model serialization format for deep learning models. It is faster and safer compared to other serialization formats like pickle.

https://huggingface.co/stabilityai/stable-cascade/tree/main

import torch
from diffusers import (
    StableCascadeDecoderPipeline,
    StableCascadePriorPipeline,
    StableCascadeUNet,
)

prompt = "4d photographic image of full body image of a cute little chibi boy realistic, vivid colors octane render trending on artstation, artistic photography, photorealistic concept art, soft natural volumetric cinematic perfect light, UHD no background"

negative_prompt = "strange colours, blurry"

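# Load the lite Stage C (prior) and Stage B (decoder) weights from the
# downloaded .safetensors files; the paths are relative to the working directory.
prior_unet = StableCascadeUNet.from_single_file(
    "stage_c_lite_bf16.safetensors",
    torch_dtype=torch.bfloat16
)
decoder_unet = StableCascadeUNet.from_single_file(
    "stage_b_lite_bf16.safetensors",
    torch_dtype=torch.bfloat16
)

# Build both pipelines around the lite UNets; the other components are fetched from the Hub.
prior = StableCascadePriorPipeline.from_pretrained("stabilityai/stable-cascade-prior", prior=prior_unet, torch_dtype=torch.bfloat16)
decoder = StableCascadeDecoderPipeline.from_pretrained("stabilityai/stable-cascade", decoder=decoder_unet, torch_dtype=torch.bfloat16)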

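# Stage C: turn the prompt into compact image embeddings.
# CPU offload moves sub-models to the GPU only while they run, which helps on low-VRAM cards.
prior.enable_model_cpu_offload()
prior_output = prior(
    prompt=prompt,
    height=1024,
    width=1024,
    negative_prompt=negative_prompt,
    guidance_scale=4.0,
    num_images_per_prompt=1,
    num_inference_steps=20
)

# Stages B and A: decode the embeddings into a full-resolution PIL image.
decoder.enable_model_cpu_offload()
decoder_output = decoder(
    image_embeddings=prior_output.image_embeddings,
    prompt=prompt,
    negative_prompt=negative_prompt,
    guidance_scale=0.0,
    output_type="pil",
    num_inference_steps=10
).images[0]

# Save the image
decoder_output.save("Little_Boy.png")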

Hint: You can easily integrate this code with Streamlit or Flask for your own use case. Try it out and share your experience in the comments.
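As a side note on the Safetensors files used in Sample Code 2: part of what makes the format safer than pickle is that a file is just a length-prefixed JSON header followed by raw tensor bytes, so reading it never executes code. The sketch below reads that header with the standard library only, which lets you check a download without loading any weights. `read_safetensors_header` and `summarize` are hypothetical helper names, not part of any library.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading weights."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def summarize(path):
    """Map each tensor name to (dtype, shape), skipping the metadata entry."""
    header = read_safetensors_header(path)
    return {
        name: (info["dtype"], info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }
```

For example, `summarize("stage_c_lite_bf16.safetensors")` should list the tensors stored in the downloaded Stage C file, assuming it sits in the working directory.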

Pros and Cons: Weighing the Options

Stable Cascade’s architecture enables easy training and fine-tuning on consumer hardware, making it accessible to researchers and developers. The model’s three-stage approach sets new benchmarks for quality, flexibility, and efficiency. Stability AI provides training and inference code on GitHub, so users can experiment with the model and customize its outputs. The model is also available for inference in the diffusers library, with features such as image variations, image-to-image generation, inpainting, outpainting, canny edge generation, and 2x super resolution.

While Stable Cascade shines in many ways, it’s essential to consider its strengths and weaknesses:

Pros:

  • Efficiency: Generates images faster and with lower resource requirements compared to Stable Diffusion.
  • Accuracy: Creates images that closely match the text prompt, including clear text rendering.
  • Customization: Open-source nature allows for fine-tuning and experimentation.

Cons:

  • Learning Curve: Setting up and using Stable Cascade locally requires some technical knowledge.
  • Memory Usage: While more efficient than Stable Diffusion, it can still require a GPU with sufficient VRAM for high-resolution images.
  • Newer Model: Still under development, so expect ongoing improvements and potential changes in usage.
  • Non-Commercial License: Stable Cascade is currently available under a non-commercial license, limiting its use for commercial applications.
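To put the memory point in rough numbers, the sketch below estimates the VRAM needed for the weights alone in bfloat16 (2 bytes per parameter). The parameter counts are the approximate figures reported in Stability AI's model card, not something measured for this article, so treat them as assumptions. Activations, the text encoder, and Stage A need additional memory on top.

```python
BYTES_PER_PARAM_BF16 = 2

# Approximate parameter counts as reported in the model card (assumed here).
PARAMS = {
    "stage_c_full": 3.6e9,
    "stage_c_lite": 1.0e9,
    "stage_b_full": 1.5e9,
    "stage_b_lite": 0.7e9,
}


def weight_gib(num_params, bytes_per_param=BYTES_PER_PARAM_BF16):
    """Memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


if __name__ == "__main__":
    for name, count in PARAMS.items():
        print(f"{name}: about {weight_gib(count):.2f} GiB of weights in bf16")
```

Under these assumptions the full Stage C weights would not fit on a 4 GB card, which is consistent with using the lite checkpoints together with CPU offload in Sample Code 2.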

Conclusion

Stable Cascade is a powerful tool for anyone interested in exploring the creative potential of Text-to-Image AI. Its efficiency, accuracy, and open-source nature make it a valuable addition to the AI art landscape. While a local setup requires some technical expertise, the potential rewards are vast. So, fire up your creative spirit and dive into the world of Stable Cascade!

Stable Diffusion 3 (SD3) is stirring immense excitement among researchers and enthusiasts like myself who are eagerly anticipating its public release. Currently in the early preview phase with limited access to select individuals on the waitlist, the unveiling of SD3 has left many of us on the edge of our seats, fingers crossed for news of its official launch date from Stability AI. During this preview period, the team’s active engagement with users to fine-tune the model for wider availability only adds to the anticipation and eagerness surrounding the imminent release of SD3. Happy Learning!
