Jim Clyde Monge

Summary

Stability AI has released SDXL 1.0, an advanced open-source text-to-image model that improves upon its predecessors with enhanced color accuracy, contrast, lighting, and shadow effects, and better handling of complex image elements.

Abstract

Stability AI has announced the release of Stable Diffusion XL (SDXL) 1.0, the latest iteration of its flagship text-to-image model. The company touts this version as the world's best open-source image generation model, with stronger color vibrancy and contrast and a better ability to render challenging concepts such as hands, text, and spatial compositions. SDXL 1.0 uses a two-stage design: a 3.5B parameter base model and a refiner, which together form a 6.6B parameter ensemble pipeline. Stability AI says this allows robust image generation at speeds suitable for consumer GPUs and cloud instances. Users can try SDXL 1.0 through platforms like Clipdrop and DreamStudio, which offer new features such as style modification and negative prompts. The model is also accessible via API at a low per-image price. While the author acknowledges the improvements, they maintain that Midjourney's image quality remains superior and plan to conduct a comparative analysis.

Opinions

  • The author suggests that SDXL 1.0, while an improvement, still falls short in image quality compared to Midjourney.
  • The new AI model is praised for its robust image generation capabilities and compatibility with consumer hardware.
  • The author expresses skepticism about the effectiveness of SDXL 1.0, citing an example where the generated image did not accurately depict the intended scene.
  • The two-stage design of SDXL 1.0 is highlighted as a positive feature, ensuring image quality without significant performance trade-offs.
  • The author is optimistic about the API access and the affordability of using SDXL 1.0 for image generation.
  • The article hints at a future comparison between SDXL 1.0, SDXL 0.9, and Midjourney, indicating an interest in evaluating the progress of image generation models.

SDXL 1.0 Is Here — The Best Image Model From Stability AI

Image by Jim Clyde Monge

Stability AI, the startup known for its open-source AI image models, has unveiled the latest and most advanced version of its flagship text-to-image model, Stable Diffusion XL (SDXL) 1.0. It follows the limited, research-only release of SDXL 0.9, and Stability AI bills the full version as the world's best open-source image generation model.

What’s new in SDXL 1.0?

Compared to the previous versions, SDXL 1.0 is particularly well-tuned for vibrant and accurate colors, with better contrast, lighting, and shadows.

According to its creators, SDXL is exceptionally good at generating concepts that image models have traditionally struggled with, such as hands, text, and spatially arranged compositions.

Does it really? I generated a sample image to find out.

Prompt: A woman in the background chasing a dog in the foreground

Image by Jim Clyde Monge

Okay, the dog’s feet and the woman’s hand look accurate, but the image composition appears odd. The woman seems to be walking in the wrong direction, not chasing the dog. That’s not good.

SDXL 1.0 is more intelligent

Creating intricate, aesthetically pleasing images with SDXL takes only a few simple words, with no need for extra qualifiers. It also distinguishes between prompts that differ only subtly, which makes the output more accurate:

  • “The Red Square”: it will generate the famous place.
  • “Red square”: it will generate a red shape.

Furthermore, SDXL 1.0 has one of the largest parameter counts of any open-access image model. Its architecture pairs a 3.5B parameter base model with a refiner, and the combined ensemble pipeline totals 6.6B parameters.

Stability AI

Its two-stage design ensures robust image generation without compromising speed, making it compatible with consumer GPUs and readily available cloud instances.
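To make the two-stage idea concrete, here is a minimal sketch of how a base-plus-refiner pass could be wired up. It is not taken from Stability AI's announcement. It assumes the Hugging Face `diffusers` library, the public `stabilityai/stable-diffusion-xl-base-1.0` and `stabilityai/stable-diffusion-xl-refiner-1.0` checkpoints, and a CUDA GPU. The `split_steps` helper and the 0.8 hand-off fraction are hypothetical, chosen only for illustration.

```python
def split_steps(total_steps, high_noise_frac):
    """Roughly estimate how many denoising steps each stage handles.

    Illustrative only: the library decides the exact cut-off internally.
    """
    if not 0.0 < high_noise_frac < 1.0:
        raise ValueError("high_noise_frac must be between 0 and 1")
    base_steps = round(total_steps * high_noise_frac)
    return base_steps, total_steps - base_steps


def generate(prompt, total_steps=30, high_noise_frac=0.8):
    """Run the base model first, then hand its latents to the refiner."""
    # Heavy imports live here so the helper above works without a GPU stack.
    import torch
    from diffusers import DiffusionPipeline

    base = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    ).to("cuda")

    # The refiner reuses the base model's second text encoder and VAE
    # to save memory.
    refiner = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-refiner-1.0",
        text_encoder_2=base.text_encoder_2,
        vae=base.vae,
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    ).to("cuda")

    # Stage 1: the base model denoises the high-noise portion and
    # returns latents instead of a decoded image.
    latents = base(
        prompt=prompt,
        num_inference_steps=total_steps,
        denoising_end=high_noise_frac,
        output_type="latent",
    ).images

    # Stage 2: the refiner finishes the remaining low-noise steps.
    return refiner(
        prompt=prompt,
        num_inference_steps=total_steps,
        denoising_start=high_noise_frac,
        image=latents,
    ).images[0]


if __name__ == "__main__":
    print(split_steps(30, 0.8))
    image = generate("Glowing jellyfish floating through a foggy forest at twilight")
    image.save("jellyfish.png")
```

The base model does most of the work, and the refiner only polishes the final steps. That division is one plausible reason the two-stage design does not have to cost much speed.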

Try it yourself

The easiest way to try the latest SDXL 1.0 is via the Clipdrop website.

Image by Jim Clyde Monge

What’s great is that there is now a brand-new filter menu where you can change the image style and aspect ratio and add negative prompts.

Image by Jim Clyde Monge

Another way to try SDXL 1.0 is through DreamStudio. Open the advanced settings page and set the Model to SDXL 1.0 (New).

Image by Jim Clyde Monge

Example images

Prompt: Magical realism, ghost riders in the sky

Image by Jim Clyde Monge

Prompt: Shot of vaporwave, fashion dog in miami

Image by Jim Clyde Monge

Prompt: Glowing jellyfish floating through a foggy forest at twilight

Image by Jim Clyde Monge

What do you think about these images? I think they look pretty cool, especially the jellyfish.

API access

Developers, rejoice! The new AI model is accessible via API. A 30-step image generation costs 0.0016 USD, which works out to 625 images for every 1 USD.
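If you want to budget for a larger batch, the arithmetic is easy to script. The sketch below assumes a flat price of 0.0016 USD per 30-step image, as quoted above. The function names are made up for illustration, and real pricing may vary with step count or resolution.

```python
from decimal import Decimal

# Per-image price for a 30-step generation, as quoted in the article.
PRICE_PER_IMAGE_USD = Decimal("0.0016")


def images_per_budget(budget_usd, price_per_image=PRICE_PER_IMAGE_USD):
    """Return how many whole images a budget covers at a flat price."""
    return int(Decimal(str(budget_usd)) // price_per_image)


def cost_for_images(count, price_per_image=PRICE_PER_IMAGE_USD):
    """Return the total cost in USD for a number of images."""
    return price_per_image * count


print(images_per_budget(1))   # 625
print(cost_for_images(1000))  # 1.6000
```

`Decimal` is used instead of floats so that small per-image prices do not pick up rounding errors.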

Image by Jim Clyde Monge

If you have an AWS account, you can also use SDXL 1.0 on Amazon SageMaker and Amazon Bedrock.

Final Thoughts

Overall, I am happy to see Stability AI finally release the first official version of SDXL. But is it better than Midjourney? Well, not really. In terms of quality, Midjourney is still superior. In the coming days, I will do a side-by-side comparison of images generated by Midjourney, SDXL 0.9, and SDXL 1.0.

