Tristan Wolff

Summary

The website content provides a comprehensive guide on creating cinematic AI-generated images using Midjourney, focusing on the structure of prompts to achieve specific film styles and eras.

Abstract

The article is the first in a three-part series exploring the craft of cinematic prompt creation for AI image generation with Midjourney. It delves into the anatomy of a cinematic image prompt, discussing the importance of the prefix "film still" for setting a cinematic tone, and how the positioning of words in a prompt can influence the AI's output. The author explains how to reference cinematic styles by era and genre, and demonstrates the versatility of combining styles to achieve unique results. The piece also touches on the robustness and combinability of style elements, the potential for AI to reflect filmmaking zeitgeists, and the ethical considerations of mimicking the styles of real-world directors and artists. The article includes practical examples and seed values to illustrate the concepts, setting the stage for the subsequent parts of the series that will delve into lighting, mise-en-scène, and camera shot types.

Opinions

  • The author prefers the "film still" prefix because it offers a strong cinematic starting point and freedom in stylistic exploration.
  • Using the "style by" prefix is recommended for its solidity in prompts and versatility in referencing eras or genres.
  • The AI's ability to reflect the zeitgeist of filmmaking is acknowledged, despite the inherent bias in AI models.
  • Combining "film still" with specific decades and genres can yield amazingly accurate results.
  • The author notes that referencing actual artists or directors can lead to the depiction of those individuals in the image, which may raise ethical questions.
  • The article suggests that with precise control over elements like lighting and camera types, AI could potentially reconstruct the styles of famous directors.
  • The use of the seed command is emphasized for consistency in results when experimenting with prompts.

Cinematography with Midjourney, Part 1: Anatomy Of A Cinematic Prompt (AI Image Generation)

Creating Cinematic AI Images with Midjourney

This is the first part of a 3-part series on cinematic prompts. Here’s what we’ll cover in this post:

  • Anatomy of AI prompts for film stills
  • Referencing cinematic style by era
  • Referencing cinematic style by genre
  • Referencing cinematic style by combinations of styles
  • (Indirect) referencing of directors and cinematographers

Anatomy of a cinematographic image prompt

Cinematographic style refers to the visual techniques used by a cinematographer or director to create a particular look or aesthetic in a film. This includes visual elements such as camera angle, lens choice, lighting, color grading, and composition.

Film Stills

As is so often the case when creating images using AI, there is more than one way to do this. In this post we will focus on Midjourney and a basic prompt that helps us to explore how we can control certain aspects of rendering in a cinematographic style. Since words at the beginning of a Midjourney prompt seem to have more “weight” than those at the end, I usually start a cinematographic prompt with the prefix “film still”. The anatomy is like this:

/imagine prompt: film still, [scene description], [style description] --ar 3:2 [options]

(Note: the “--ar 3:2” part sets the aspect ratio to a more cinematic one. Unfortunately, Midjourney version 4 does not yet allow the use of 16:9 or 21:9.)
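As a minimal sketch, the anatomy above can be assembled with a small helper. This is only string building for text you paste after /imagine; it does not talk to Midjourney, and the function name `build_prompt` and its arguments are hypothetical. It assumes the double-hyphen parameter form (`--ar`, `--seed`) and puts the parameters at the end of the prompt.

```python
def build_prompt(scene, style=None, prefix="film still", aspect_ratio="3:2", seed=None):
    """Assemble a prompt string: prefix, scene, style, then parameters.

    Hypothetical helper for illustration; it only returns text.
    """
    # Text parts are comma-separated, with the prefix first because
    # early words seem to carry more weight.
    parts = [part for part in (prefix, scene, style) if part]
    prompt = ", ".join(parts)
    # Parameters go at the very end of the prompt.
    prompt += f" --ar {aspect_ratio}"
    if seed is not None:
        prompt += f" --seed {seed}"
    return prompt


print(build_prompt("domestic scene", seed=1000))
print(build_prompt("street scene", style="style by 1930s", seed=3000))
```

Passing `prefix=None` drops the prefix, which makes it easy to compare a prompt with and without “film still” on the same seed.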

The prefix “film still” instructs Midjourney to work towards a cinematic image composition right from the beginning without the need to explicitly describe camera angles, how people are arranged in the frame, depth of field, lighting, etc.

Of course, you could also drop the “film still” prefix altogether and still create great film footage. In fact, doing so would give you more freedom in scene composition, but you would also have to state explicitly how you want Midjourney to establish a cinematic look.

A third option I see quite often is “footage from XY”, e.g. “footage from a 1973 science fiction film”. This can yield amazing results if you know exactly what you want in terms of cinematic elements (lighting, type of shot, staging, etc.), but with a strong prompt prefix like this at the beginning, you also limit the possible deviations from it that you want to achieve later when you start adding things (we’ll get to that later in this post and in parts 2 and 3).

So, let’s have a look at these basic ways of starting a cinematographic rendering with Midjourney (in all the following prompts we keep the seed value the same within each comparison so that the results are comparable; you can read more about the seed command here).

First, without any film-related prefix:

domestic scene --ar 3:2 --seed 1000

Next, with “film scene”, which basically just changes the composition and gives the image a more photorealistic look (by the way, this is something you can always add to any prompt with a quick reference like “8k, photorealistic”):

film scene, domestic scene --ar 3:2 --seed 1000

Now, two examples with “footage from X”, which immediately makes the image more cinematic but also forces a bunch of style elements from the genre you start with. This is perfect if you know exactly which genre you want to explore. Not so good though if you want to develop your image from characters or scenic elements.

footage from 1980s sci fi movie, domestic scene --ar 3:2 --seed 1000

In this next example, we can also see how “footage” can be interpreted literally, so that you end up with small monitors in your image. Of course, you can use such artifacts artistically, but for the exploration of cinematic composition it is rather annoying, because you would have to design your prompt to work against this tendency.

footage from 1974 drama, domestic scene --ar 3:2 --seed 1000

Finally, “film still”, the prefix I prefer for this post because it offers both the strongest cinematic starting point and the most freedom for further stylistic exploration. (I use two seed values in the example below to show how variation is processed at a basic level; note how the cinematic look and composition are preserved.)

film still, domestic scene --ar 3:2 --seed 1000

and

film still, domestic scene --ar 3:2 --seed 3000

Referencing Cinematic Style by Era

So with our basic prompt anatomy in place, let’s jump right into exploring cinematic styles. Again, there are many ways to reference a style. I use the “style by” prefix in the following examples for much the same reasons I chose “film still”: it is both very solid (not easily outweighed in a long prompt) and very versatile (you can use it with references to either eras or genres, and you can use it with combinations).

Let’s start with a brief cycle through a “street scene” across the decades, starting in the 1930s:

film still, street scene, style by 1930s --ar 3:2 --seed 3000

film still, street scene, style by 1950s --ar 3:2 --seed 3000

film still, street scene, style by 1970s --ar 3:2 --seed 3000

film still, street scene, style by 1980s --ar 3:2 --seed 3000

film still, street scene, style by 1990s --ar 3:2 --seed 3000

Honestly, isn’t it amazing how well Midjourney reflects the zeitgeist of filmmaking? Of course, there is an obvious bias; that’s a well-known problem with all AI models these days. But this is exactly what makes prompt elements like the “style by” prefix interesting: their robustness and combinability.

Here’s what I mean: an 80s movie from the streets of Tokyo.

film still, street scene, style by 1980s and Tokio --ar 3:2 --seed 3000

Or a fictional blaxploitation movie from the 50s:

film still, street scene, style by 1950s and Blaxploitation --ar 3:2 --seed 3000

Referencing Cinematic Style by Genre

As the Blaxploitation example has already shown, another way to quickly refer to a particular cinematic style is by its genre. This is extremely effective. Look at what happens to the “street scene” above when we add “style by psychological thriller movie”:

film still, street scene, style by psychological thriller movie --ar 3:2 --seed 3000

Here’s another example with “a family scene”:

film still, a family scene, style by psychological thriller --ar 3:2 --seed 3000

Or:

film still, a family scene, style by horror movie --ar 3:2 --seed 3000

Well, Midjourney seems to understand the film-historical connection between “family” and “horror” and turns this prompt into an homage to a very famous black-and-white horror-comedy series centered on a family. :)

Let’s fix it like so:

film still, a family scene, style by 2010 horror movie --ar 3:2 --seed 3000

Okay, this is actually scary stuff. You’ll have to go through decades of horror movies yourself. Just combine “horror movie” and the decade with the prefix “style by”. Here’s an example of how this works. I am using sci-fi movies:

film still, a man aboard a spaceship, style by [decade] sci-fi movie --ar 3:2 --seed 1000
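The [decade] placeholder can be filled in mechanically if you want to paste a whole series of prompts one after another. The following is a rough sketch that only generates the prompt text; the names `TEMPLATE` and `decade_prompts` are made up for this illustration, and the decade format (“1950s”, “1960s”, ...) is an assumption about how you want to word it.

```python
# Template taken from the prompt above; {decade} marks the part that varies.
TEMPLATE = (
    "film still, a man aboard a spaceship, "
    "style by {decade} sci-fi movie --ar 3:2 --seed 1000"
)


def decade_prompts(first=1930, last=2010, template=TEMPLATE):
    """Return one prompt per decade from `first` to `last` (inclusive)."""
    # Step through the years in tens and append "s" to name the decade.
    return [
        template.format(decade=f"{year}s")
        for year in range(first, last + 1, 10)
    ]


for prompt in decade_prompts(1950, 1980):
    print(prompt)
```

Keeping the seed fixed in the template means that the decade is the only thing that changes from one prompt to the next.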

Referencing Cinematic Style With Combinations of Styles

You can take the combination game much further than the examples above. For instance, you could use art styles as an influence on scene composition and scene distortion:

film still, a man in a hotel bar, style by thriller and minimalism --ar 3:2 --seed 3000

But you could also cross over to the cinematography of animation by mixing genre and art styles like so:

film still, a man in a hotel bar, style by thriller and cubism --ar 3:2 --seed 3000
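If you want to try many genre and art style pairings on the same scene, the “style by X and Y” pattern can be enumerated. This is just a sketch under the assumption that you paste the resulting lines into Midjourney yourself; the function name `combined_style_prompts` is hypothetical, and the lists only contain styles already used in this post.

```python
from itertools import product


def combined_style_prompts(scene, genres, art_styles, seed=3000):
    """Build one 'style by <genre> and <art style>' prompt per pairing."""
    prompts = []
    # product() yields every (genre, art style) pair in order.
    for genre, art_style in product(genres, art_styles):
        prompts.append(
            f"film still, {scene}, style by {genre} and {art_style} "
            f"--ar 3:2 --seed {seed}"
        )
    return prompts


for prompt in combined_style_prompts(
    "a man in a hotel bar",
    genres=["thriller", "psychological thriller"],
    art_styles=["minimalism", "cubism"],
):
    print(prompt)
```

Two genres and two art styles give four prompts, all on the same seed, so differences between the images come from the style pairing alone.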

Below is another example. Can you guess which art style I combined with “thriller”?

However, the limit of this approach is reached when referring to actual artists or directors. With a prompt like that, you can imitate the overall style of a movie by a certain director, but you will also eventually end up with a depiction of the director.

film still, a man in a hotel bar, style by wes anderson --ar 3:2 --seed 3000

You can weaken this effect by adding “directed by” before referring to directors and cinematographers. But keep in mind that this will always tend to include visual features of the directors/cinematographers when you have only one character in the scene. When there are multiple characters like in the above “street scene” prompts, the effect is not so strong.

film still, a man in a hotel bar, style by 1960s, directed by wes anderson --ar 3:2 --seed 3000

Besides, referring to the style of existing artists raises some ethical questions. So why not re-invent with AI what famous directors did with their tools (different lighting and camera types)?

If we get control over lighting, camera types, lenses, and shots, shouldn’t we be able to reconstruct certain styles with the right prompts? We’ll cover this in parts 2 and 3 of the “Cinematic Prompt” series. Thanks for reading! 🙏

Part 2: Lighting & Mise-en-scène

Part 3: Camera and shot types
