Consistent Cinematography with Midjourney: Cinematic Prompt Cheatsheet

A Guide To Creating Cinematic Images With Midjourney

Cinematic Prompt Anatomy

There are many ways to create images that have a “cinematic” look. Generally, we perceive images as “cinematic” if they contain artifacts of real-world filmmaking, such as certain camera angles and shot types, cinematic scene composition, lighting effects, or character placement.

This may sound trivial at first, but this general observation has some interesting implications when it comes to developing “cinematic prompts” with Midjourney. Basically, in order to make our images “cinematic” we want to show as many artifacts of traditional filmmaking as possible.

Here is a schematic drawing showing how a single prompt can contain all the necessary cinematic artifacts and how they relate to the corresponding filmmaking crafts.

A Prefix or Trigger is used to add context to Midjourney’s holistic prompt interpretations
A scene description includes (explicitly or implicitly) character placement, action, composition, mise-en-scène/set design, props, etc.
Style & shot descriptions include (explicitly or implicitly) camera & shot type, lighting, post-production, film material/quality, etc.
Parameters define aspect ratio, quality, and chaos.
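
As a minimal sketch, the four parts of this anatomy can be assembled in Python. The class name `CinematicPrompt`, its field names, the comma separator, and the example values are my own illustrative assumptions and not part of Midjourney; `--ar` and `--chaos` are Midjourney's parameters for aspect ratio and chaos.

```python
from dataclasses import dataclass, field


@dataclass
class CinematicPrompt:
    """Hypothetical container for the four parts of a cinematic prompt."""
    prefix: str                                  # e.g. "film still"
    scene: str                                   # scene description
    style: str = ""                              # style & shot description
    params: dict = field(default_factory=dict)   # e.g. {"ar": "16:9"}

    def render(self) -> str:
        # Join the non-empty text parts with commas (an assumed convention).
        text = ", ".join(p for p in (self.prefix, self.scene, self.style) if p)
        # Append parameters in Midjourney's "--name value" form.
        flags = " ".join(f"--{name} {value}" for name, value in self.params.items())
        return f"{text} {flags}".strip()


prompt = CinematicPrompt(
    prefix="film still",
    scene="samurai on a donkey",
    style="depth of field, style by 60s adventure movies, technicolor",
    params={"ar": "16:9", "chaos": 10},
)
print(prompt.render())
```

Keeping the parts in separate fields makes it easier to change one of them at a time and to see which part caused a change in the rendering.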

It’s important to remember that these categories can’t be sharply differentiated, since they semantically overlap. For example, when using the same prompt (i.e. same prefix + same scene description), the usage of the word

“night” in the scene description implies specific lighting effects (e.g. “Backlighting”),
replacing it with “having a conversation” implies specific shot types (e.g. “Two Shot”, “Over-the-shoulder shot”), and
changing it to “futuristic room” implies specific set designs and color grading (“sterile” or “clinical” indoor set design, resemblance to genre reference works, e.g. “2001: A Space Odyssey”):

same prompt, only one expression changed in scene description (“night”, “having a conversation”, “futuristic room”)
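
A small sketch of this kind of one-expression swap. The exact prompt behind the images is not given here, so the base scene (“samurai”, borrowed from the later examples) and the comma-joined layout are assumptions for illustration only.

```python
PREFIX = "film still"     # same prefix for every variant
BASE_SCENE = "samurai"    # hypothetical base scene, for illustration only

# Only this one expression changes between renderings.
expressions = ["night", "having a conversation", "futuristic room"]

variants = [f"{PREFIX}, {BASE_SCENE}, {expression}" for expression in expressions]

for variant in variants:
    print(variant)
```

Because everything else stays fixed, any difference between the renderings can be attributed to the swapped expression.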

Something to keep in mind when crafting scenes with Midjourney:

scene and style descriptions always overlap (we have to deal with interference when designing prompts)
the shorter the prompt, the more control; long prompts tend to entail redundant expressions or unintended overlaps
the longer the prompt, the less control but the more explorative potential
redundancy is necessary in some cases, e.g. “enforcement” (see: style/shot descriptions)

Based on my personal experience as a screenwriter using visual storytelling to create scenes based on a given narrative, there’s this rule of thumb: The more accurately you can describe a scene, the shorter the prompt gets — the shorter the prompt, the easier you can control it and make changes.

Let’s take a look at the individual elements of this cheat sheet for prompt design:

Prefixes or Triggers

These are pretty self-explanatory. We use them to tell Midjourney which medium the rendering should suit (in our case a cinema screen rather than, say, the canvas of a Renaissance painting). The following expressions work well as prefixes/triggers to tell Midjourney to render in a cinematic context:

film still
cinematic still
cinematic shot
movie shot
footage from movie
movie footage
movie scene
cinematic movie scene
footage from [INSERT genre description] movie

etc.

They are interchangeable most of the time, as we can see here (same prompt, iterating prefixes):

“scene from movie” vs “award-winning movie scene”
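
Iterating prefixes over one fixed scene can be sketched like this. The function names and the comma separator are assumptions made for the example; the prefixes and the scene are taken from this article.

```python
# Prefixes from the list above, treated here as interchangeable triggers.
PREFIXES = [
    "film still",
    "cinematic still",
    "cinematic shot",
    "movie shot",
    "footage from movie",
    "movie footage",
    "movie scene",
    "cinematic movie scene",
]


def genre_prefix(genre: str) -> str:
    """Fill the "footage from [genre] movie" template with a genre description."""
    return f"footage from {genre} movie"


def iterate_prefixes(scene: str, prefixes: list[str]) -> list[str]:
    """Return one prompt per prefix, keeping the scene description fixed."""
    return [f"{prefix}, {scene}" for prefix in prefixes]


SCENE = "samurai sitting at a table in a kitchen"
for prompt in iterate_prefixes(SCENE, PREFIXES + [genre_prefix("sci-fi")]):
    print(prompt)
```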

If we modify Prefixes/Triggers further with …

black and white movie scene
sci-fi movie footage
etc.

“black and white movie scene” vs “sci-fi movie footage”

… we can achieve explicit (“black and white”) or implicit (“sci-fi” changes the color grading, not the costumes or setting) prompt effects.

Scene Descriptions

Whether an expression is explicit or implicit strongly influences how a scene description is rendered. For example, compare these two pictures:

“sitting at a table in a kitchen” vs “domestic scene”

In the images above, the same prompt is used and the only change is the description of the scene (one explicitly uses “samurai sitting at a table in a kitchen”, and the other produces similar content implicitly with “domestic scene of a samurai”). Note also that “samurai” has cultural and historic implications for the depiction of the set, characters, and props.

Here’s another example: the same prompt (“film still” as prefix, “samurai” as scene description), where a single word added to the scene description causes overlaps with shot types, mise-en-scène, and lighting:

So, there are two things to be aware of:

1. The following elements are always included in a scene description (either explicitly or implicitly):

character description
description of action
set description/set design
scene composition
props

2. The scene description elements generally overlap/interfere with the following shot/style elements:

lighting
location (interior/exterior)
color grading
shot type (camera angle, lens & shot size)

Style descriptions

In addition to scene descriptions, style descriptions are used to

enforce scene description (e.g. using “wide-angle” in addition to “mountains in the background” or using “volumetric lighting” in addition to a “nightly park scene” or using “over-the-shoulder shot” in addition to “having a conversation”, etc.),
alter scene description (e.g. using “style by sci-fi movie” to change lighting and color-grading, or using “atmosphere”, “vibe” or “symmetrical” to alter scene composition)
set color-grading, lighting, composition, shot type, camera angle, lens & shot size (I wrote about these in my three-part tutorial “Cinematography with Midjourney”)

All of these can be prompted

directly, e.g. “8k, wide angle, black and white film”,
by reference, e.g. “style by 80s sci-fi movie”, “directed by stanley kubrick”
or as a mixture of both “style by 70s thriller movie directed by stanley kubrick, close-up, shallow depth of field”.
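
The three ways of prompting a style can be sketched as a small helper. The function name `style_description` and the choice to put references before direct terms are assumptions for this example; the expressions themselves are the ones quoted above.

```python
def style_description(direct=(), references=()):
    """Combine reference expressions and direct style terms into one string."""
    # References first, then direct terms (an assumed ordering).
    return ", ".join([*references, *direct])


# Directly: name the desired properties.
direct_only = style_description(direct=["8k", "wide angle", "black and white film"])

# By reference: point to a genre, era, or director.
by_reference = style_description(references=["style by 80s sci-fi movie"])

# Mixture of both.
mixed = style_description(
    direct=["close-up", "shallow depth of field"],
    references=["style by 70s thriller movie directed by stanley kubrick"],
)

print(direct_only)
print(by_reference)
print(mixed)
```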

Examples

Here are some examples of how to use the above cheatsheet to create scenes and keep track of interferences from explicit/implicit scene/style descriptions (again: if you are exploring, you would probably not even want to be that precise with a prompt, but it can be very useful when fine-tuning scenes or working on the accurate depiction of elements from a given story).

Example 1: Switching shot types via style description

This rendering uses “samurai on a donkey” as scene description and “depth of field, style by 60s adventure movies, technicolor” as style description.

Adding “close-up” to the style description changes the shot size as intended since we have no interference from the scene description.

Example 2: Style (in)consistency

Generally, results are better when style consistency is preserved. For example, adding the expression “8k” to the above example would not make sense, since it destroys the 60s lighting and color-grading interference.

Instead, you would use “8k” in those style descriptions that prompt for a different visual style, for example as an addition to “depth of field, cinematic color grading, atmosphere” which already triggers a more modern, mainstream cinematography (e.g. “teal and orange” color-grading, etc.). In that case, “8k” fits the overall style description very well.

same prompt, without and with “8k” in style description “depth of field, cinematic color grading, atmosphere”

Example 3: Switching shot size via scene description & positioning

In the following example, the same prompt is used for all three images:

prefix: “movie still”
style description: “style by 2010 sci-fi, dramatic lighting”
scene description: “samurai with a sword on a spaceship”

We now want to imply shot size via scene description & positioning without any additional shot reference in the style description:

without & with “eyes” added at the end/beginning of the prompt

The first image has no additions, resulting in some wide-angle shots and a medium shot.
Image number two has “eyes” added at the end of the scene description, resulting in two medium shots and two close-ups.
Image number three has “eyes” added at the very beginning of the prompt (before the prefix), resulting in extreme close-ups.
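
The three variants can be sketched as follows. The helper `build`, the order prefix, scene, style, and the comma separator are assumptions for illustration; the prompt parts are the ones from this example.

```python
PREFIX = "movie still"
SCENE = "samurai with a sword on a spaceship"
STYLE = "style by 2010 sci-fi, dramatic lighting"


def build(prefix, scene, style, lead=""):
    """Join the prompt parts with commas; `lead` goes before the prefix."""
    parts = [lead, prefix, scene, style]
    return ", ".join(part for part in parts if part)


# 1) No additions.
baseline = build(PREFIX, SCENE, STYLE)

# 2) "eyes" at the end of the scene description.
eyes_in_scene = build(PREFIX, f"{SCENE}, eyes", STYLE)

# 3) "eyes" at the very beginning of the prompt, before the prefix.
eyes_first = build(PREFIX, SCENE, STYLE, lead="eyes")

for prompt in (baseline, eyes_in_scene, eyes_first):
    print(prompt)
```

The only thing that changes between the second and third variant is the position of the same word.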

Example 4: Enforcement by combining style & scene description

Here we use “samurai having a conversation” as scene description, which gives us the expected “two shot” / “over the shoulder” tendency.

If we want to change it to a wide angle shot, adding “wide angle” in the style description doesn’t really help:

Also, enforcing “wide angle” with repetition in both scene and style description only causes minor changes in shot size, because the expression “having a conversation” still maintains the “two shot” interference.

But we can enforce “wide angle” by adding an explicit background description (“mountains in the background”) in the scene description:

For more information about cinematography with Midjourney, follow me on Twitter or Medium (use my referral link to get full access to all my articles and those of thousands of other writers).

Also, feel free to check out the aforementioned three-part tutorial “Cinematography with Midjourney”:

Cinematography with Midjourney, Part 1: Anatomy Of A Cinematic Prompt (AI Image Generation)

Creating Cinematic AI Images with Midjourney

bootcamp.uxdesign.cc

Cinematography with Midjourney, Part 2: Mis-En-Scène & Lighting (AI Image Generation)

Cinematography with Midjourney, Part 3: Camera & Shot Types (AI Image Generation)

Join Medium with my referral link - Tristan Wolff