Did Stable Diffusion 3 Just Beat Midjourney V6 and Dall-E 3?

Stability AI just dropped a bombshell on the world of AI image generation: Stable Diffusion 3 (SD3). It’s currently the most advanced and capable image generator they have released, with massive upgrades from the already-impressive Stable Diffusion 2.1, released over a year ago.

So, what’s new in Stable Diffusion 3?

Text rendering support
Improved performance
Multi-subject prompts
Better image quality

SD3 also uses a diffusion transformer architecture, similar to the one behind OpenAI’s Sora. Does this suggest that SD3 could also generate videos? That intriguing possibility is a discussion for another time.

In this article, I am going to compare the images generated by SD3 against two of the most popular and powerful AI image generators—Midjourney V6 and Dall-E 3.

Disclaimer: The Stable Diffusion 3 images I used in this article were generated by other people who have early access to the preview model.

Let’s get started.

Prompt #1

Photo of a red sphere on top of a blue cube. Behind them is a green triangle, on the right is a dog, on the left is a cat

Here is the image generated by Stable Diffusion 3:

The thing that jumps out at me is the green tint on the necks of the cat and the dog. It looks like SD3 is able to understand the environment and simulate reflections.

I can’t help but wonder if it secretly learned this trick from a dataset of behind-the-scenes photoshoots with green screens.

In contrast, the images from Midjourney V6 and Dall-E 3, while impressive, missed the mark in different ways.

Midjourney V6 and Dall-E 3 generated images

Midjourney V6 oddly placed the animals atop blue cubes, whereas Dall-E 3’s results were bewilderingly incoherent. SD3 takes this round hands-down for actually matching what the prompt asked for.
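
If you want to make "matching what the prompt asked for" a little less subjective, one option is a simple checklist: break the prompt into the things it explicitly asks for, then tick off the ones you can see in the image. The Python sketch below is only an illustration of that idea, not how any of the images in this article were judged. The Requirement class, the adherence_score function, and the example annotation are all hypothetical, and the ticking is still done by a human.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """One thing the prompt explicitly asks for."""
    description: str


# Requirements read directly off Prompt #1.
PROMPT_1 = [
    Requirement("red sphere on top of a blue cube"),
    Requirement("green triangle behind them"),
    Requirement("dog on the right"),
    Requirement("cat on the left"),
]


def adherence_score(requirements, satisfied):
    """Return the fraction of requirements a reviewer marked as satisfied.

    `satisfied` is a set of requirement descriptions that a human judged
    to be present in the image; nothing here inspects pixels.
    """
    if not requirements:
        raise ValueError("need at least one requirement")
    met = sum(1 for r in requirements if r.description in satisfied)
    return met / len(requirements)


# Hypothetical annotation for illustration only, not a measurement of any
# image in this article: a reviewer ticks three of the four boxes.
example_annotation = {
    "red sphere on top of a blue cube",
    "green triangle behind them",
    "cat on the left",
}
print(adherence_score(PROMPT_1, example_annotation))  # 0.75
```

A score like this says nothing about style or lighting, which is exactly why it can disagree with which image you find prettiest.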

Prompt #2

Resting on the kitchen table is an embroidered cloth with the text ‘good night’ and an embroidered baby tiger. Next to the cloth there is a lit candle. The lighting is dim and dramatic

This is what SD3 came up with:

This is a beautiful close-up photo. I love how it blurs out the sides and focuses on the tiger and the text. The detail on the embroidery is also fantastic. You can practically feel the scratchy texture of the stitching, plus the way it gets fuzzier as it recedes.

Now here’s what the competitors’ results look like:

Midjourney, as expected, comes through with those signature dark tones and spot-on lighting. Dall-E 3 also did a great job with this one, although both of them added several other elements that are not in the prompt.

Does that feel like cheating to anyone else? For me, SD3’s strict adherence to the prompt makes it the winner here… though I wouldn’t blame anyone for being smitten by Midjourney’s artistic interpretation.

Prompt #3

Photo of an 90’s desktop computer on a work desk, on the computer screen it says “welcome”. On the wall in the background we see beautiful graffiti with the text “SD3” very large on the wall.

Here’s how SD3 interpreted the prompt:

The way the huge text blends with the background is so seamless. And notice how consistent the font is. It looks like an illustration that came straight out of Photoshop.

Now let’s take a look at what Midjourney and Dall-E 3 came up with:

In terms of style and details, Midjourney is an easy pick. But the prompt asked for very large text on the wall, and Midjourney didn’t deliver that. So for coherency, SD3 is best here. Dall-E 3’s result is a mess.

Prompt #4

Three transparent glass bottles on a wooden table. The one on the left has red liquid and the number 1. The one in the middle has blue liquid and the number 2. The one on the right has green liquid and the number 3.

SD3 perfectly rendered what the prompt asked.

Here are the results from Midjourney and Dall-E 3:

The images from Midjourney and Dall-E 3 were also impressive, with each generator presenting its own unique aesthetic qualities. Personally, I find Midjourney’s depiction of weathered tables and rusty bottles especially captivating.

That’s about it. To get more of a sense of what SD3 can do, definitely check out the X profile of Stability AI’s CEO, Emad Mostaque. He frequently posts images generated with SD3.

Honestly, where SD3 takes the crown is in its insanely accurate understanding of prompts. When you want exactly what you describe, it delivers. But Midjourney still holds its own when it comes to that unique, artistic expressiveness that can take images to the next level.

What do you think of this new open-source model? Do you think it’s good enough to trump other competitors like Adobe Firefly, Amazon Titan G1, or Google’s Imagen 2?

Final Thoughts

Looking at the sample images, it’s clear that SD3 excels in text rendering and prompt coherency, arguably outperforming its biggest competitors. In terms of image quality, though, I still think Midjourney is the king.

But why is this a big deal?

Unlike its main competitors, Stable Diffusion is and has always been open-source and free to use! This means you might soon be able to download SD3 and run it locally at no cost. That freedom is massive.

But there’s a huge downside—this also raises serious, and I mean serious, ethical concerns. Imagine the potential for even more convincing deep-fake images and videos.

SD3’s tech also hints that its potential goes beyond images. Are we on the verge of competing with generative tools like OpenAI’s Sora for video or Luma Labs’ Genie for 3D? Now things get really interesting.

This story is published on Generative AI. Connect with us on LinkedIn and follow Zeniteq to stay in the loop with the latest AI stories. Let’s shape the future of AI together!