Cinematography with Midjourney, Part 1: Anatomy Of A Cinematic Prompt (AI Image Generation)
Creating Cinematic AI Images with Midjourney
This is the first part of a 3-part series on cinematic prompts. Here’s what we’ll cover in this post:
- Anatomy of AI prompts for film stills
- Referencing cinematic style by era
- Referencing cinematic style by genre
- Referencing cinematic style by combinations of styles
- (Indirect) referencing of directors and cinematographers
Anatomy of a cinematographic image prompt
Cinematographic style refers to the visual techniques used by a cinematographer or director to create a particular look or aesthetic in a film. This includes visual elements such as camera angle, lens choice, lighting, color grading, and composition.
Film Stills
As is so often the case when creating images using AI, there is more than one way to do this. In this post we will focus on Midjourney and a basic prompt that helps us to explore how we can control certain aspects of rendering in a cinematographic style. Since words at the beginning of a Midjourney prompt seem to have more “weight” than those at the end, I usually start a cinematographic prompt with the prefix “film still”. The anatomy is like this:
/imagine prompt: film still, [scene description], [style description] --ar 3:2 [options]
(Note: the “--ar 3:2” part sets the aspect ratio to a more cinematographic one. Unfortunately, Midjourney version 4 does not yet allow the use of 16:9 or 21:9.)
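The anatomy above can be sketched as a small helper that assembles the prompt text. This is only an illustration: the function name `build_prompt` and its arguments are hypothetical, not part of Midjourney. The only Midjourney-specific pieces are the `--ar` and `--seed` parameters, which are written with two hyphens.

```python
def build_prompt(scene, style="", prefix="film still", aspect_ratio="3:2", seed=None):
    """Assemble a prompt string: prefix, scene, style, then parameters.

    Hypothetical helper for illustration; it only builds text to paste
    after "/imagine prompt:".
    """
    # Text parts are comma-separated; empty parts are skipped.
    parts = [part for part in (prefix, scene, style) if part]
    prompt = ", ".join(parts)
    # Parameters go at the end, each introduced by two hyphens.
    prompt += f" --ar {aspect_ratio}"
    if seed is not None:
        prompt += f" --seed {seed}"
    return prompt


print(build_prompt("street scene", "style by 1930s", seed=3000))
```

Keeping the prefix as an argument makes it easy to swap “film still” for another opening, or to drop it by passing an empty string.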
The prefix “film still” instructs Midjourney to work towards a cinematic image composition right from the beginning without the need to explicitly describe camera angles, how people are arranged in the frame, depth of field, lighting, etc.
Of course, you could also drop the “film still” prefix altogether and still create great film footage — in fact, it would give you more freedom in terms of scene composition — but you would also have to add explicitly in which way you want Midjourney to establish a cinematic look.
A third option I see quite often is “footage from XY”, e.g. “footage from a 1973 science fiction film”. This can yield amazing results if you know exactly what you want in terms of cinematic elements (lighting, type of shot, staging, etc.), but with a strong prompt prefix like this at the beginning, you also limit the possible deviations from it that you want to achieve later when you start adding things (we’ll get to that later in this post and in parts 2 and 3).
So, let’s have a look at these basic ways of starting a cinematographic rendering with Midjourney. In all the following prompts we keep the seed value the same to be able to compare our results; the Midjourney documentation has more on the seed parameter.
First, without any film-related prefix:
domestic scene --ar 3:2 --seed 1000

Next, with “film scene”, which basically just changes the composition and gives it a more photorealistic look (by the way, something you can always add to any prompt with a quick reference like “8k, photorealistic”):
film scene, domestic scene --ar 3:2 --seed 1000

Now, two examples with “footage from X”, which immediately makes the image more cinematic but also forces a bunch of style elements from the genre you start with. This is perfect if you know exactly which genre you want to explore, but less suitable if you want to develop your image from characters or scenic elements.
footage from 1980s sci fi movie, domestic scene --ar 3:2 --seed 1000

In this next example, we can also see how “footage” can be interpreted literally, so that you end up with small monitors in your image. Of course, you can use such artifacts artistically, but for the exploration of cinematic composition it is rather annoying, because you would have to design your prompt to work against such a predisposition.
footage from 1974 drama, domestic scene --ar 3:2 --seed 1000

Finally, “film still”, the prefix I prefer for this post because it offers both the most cinematic predisposition and the most freedom for further stylistic exploration. (I use two seed values in the example below to show how variation is processed at a basic level; note how the cinematic look and composition are preserved.)
film still, domestic scene --ar 3:2 --seed 1000
and
film still, domestic scene --ar 3:2 --seed 3000
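To reproduce this kind of side-by-side comparison, the prompt variants can be generated programmatically. The following is a minimal sketch; `with_prefix` and the constants are hypothetical names chosen for illustration. The point is that the scene and the seed stay fixed so that only the prefix changes between prompts.

```python
SCENE = "domestic scene"
SEED = 1000  # same seed for every variant, so only the prefix differs

# The prefixes compared in this post; "" means no film-related prefix.
PREFIXES = [
    "",
    "film scene",
    "footage from 1980s sci fi movie",
    "footage from 1974 drama",
    "film still",
]


def with_prefix(prefix, scene=SCENE, seed=SEED):
    """Return the prompt text for one prefix (hypothetical helper)."""
    text = f"{prefix}, {scene}" if prefix else scene
    return f"{text} --ar 3:2 --seed {seed}"


variants = [with_prefix(prefix) for prefix in PREFIXES]
for variant in variants:
    print(variant)
```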


Referencing Cinematic Style by Era
So with our basic prompt anatomy in place, let’s jump right into exploring cinematic styles. Again, there are many ways to reference a style. I use the “style by” prefix in the following examples for reasons similar to why I chose “film still”: it is both very solid (not easily overweighted in a long prompt) and very versatile (you can use it with references to either eras or genres AND use it with combinations).
Let’s start with a brief cycle through a “Street Scene” from the 1930s to the 2000s:
film still, street scene, style by 1930s --ar 3:2 --seed 3000

film still, street scene, style by 1950s --ar 3:2 --seed 3000

film still, street scene, style by 1970s --ar 3:2 --seed 3000

film still, street scene, style by 1980s --ar 3:2 --seed 3000

film still, street scene, style by 1990s --ar 3:2 --seed 3000

Honestly, isn’t it amazing how well Midjourney can reflect the zeitgeist of filmmaking? Of course, there is an obvious bias; that’s a well-known problem with all AI models these days. But that’s exactly what’s interesting about prompt elements like the “style by” prefix: robustness and combinability.
Here’s what I mean: an 80s movie from the streets of Tokyo.
film still, street scene, style by 1980s and Tokyo --ar 3:2 --seed 3000

Or a fictional blaxploitation movie from the 50s:
film still, street scene, style by 1950s and Blaxploitation --ar 3:2 --seed 3000
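The era cycle and the “style by A and B” combinations follow the same pattern, so they can be sketched with one small function. The names `style_by` and `street_scene` are hypothetical; the sketch only assumes that several style references are joined with “and”, as in the prompts above.

```python
ERAS = ["1930s", "1950s", "1970s", "1980s", "1990s"]


def style_by(*references):
    """Join one or more style references as 'style by A and B'."""
    return "style by " + " and ".join(references)


def street_scene(*references, seed=3000):
    """Build the street-scene prompt for the given style references."""
    return f"film still, street scene, {style_by(*references)} --ar 3:2 --seed {seed}"


# One prompt per decade, all with the same seed for comparison.
era_prompts = [street_scene(era) for era in ERAS]

# A combination of an era with a second reference.
combo = street_scene("1950s", "Blaxploitation")

for prompt in era_prompts + [combo]:
    print(prompt)
```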

Referencing Cinematic Style by Genre
As the Blaxploitation example has already shown, another way to quickly refer to a particular cinematic style is by its genre. This is extremely effective. Look at what happens to the “street scene” above when we add “style by psychological thriller movie”:
film still, street scene, style by psychological thriller movie --ar 3:2 --seed 3000

Here’s another example with “a family scene”:
film still, a family scene, style by psychological thriller --ar 3:2 --seed 3000

Or:
film still, a family scene, style by psychological thriller --ar 3:2 --seed 3000

Well, Midjourney seems to understand the film-historical connection between “family” and “horror” and turns this prompt into an homage to a very famous black-and-white horror-comedy series centered on a family. :)
Let’s fix it like so:
film still, a family scene, style by 2010 horror movie --ar 3:2 --seed 3000

Okay, this is actually scary stuff. You’ll have to go through decades of horror movies yourself. Just combine “horror movie” and the decade with the prefix “style by”. Here’s an example of how this works. I am using sci-fi movies:
film still, a man aboard a spaceship, style by [decade] sci-fi movie --ar 3:2 --seed 1000
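Going through the decades by hand gets tedious, so here is a minimal sketch that fills the [decade] placeholder for several decades and genres at once. The decade list is an arbitrary illustrative choice, and `decade_grid` is a hypothetical helper, not a Midjourney feature.

```python
from itertools import product

TEMPLATE = "film still, {scene}, style by {decade} {genre} --ar 3:2 --seed {seed}"

DECADES = ["1950s", "1970s", "1990s", "2010s"]  # illustrative selection
GENRES = ["sci-fi movie", "horror movie"]


def decade_grid(scene, seed=1000):
    """Return one prompt per (genre, decade) pair for the given scene."""
    return [
        TEMPLATE.format(scene=scene, decade=decade, genre=genre, seed=seed)
        for genre, decade in product(GENRES, DECADES)
    ]


grid = decade_grid("a man aboard a spaceship")
for prompt in grid:
    print(prompt)
```

Because the seed is the same in every prompt, differences between the resulting images can be attributed to the decade and genre rather than to random variation.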




Referencing Cinematic Style With Combination of Styles
You can take the combination game much further than the examples above. For instance, you could use art styles as an influence on scene composition and scene distortion:
film still, a man in a hotel bar, style by thriller and minimalism --ar 3:2 --seed 3000

But you could also cross over to the cinematography of animation by mixing genre and art styles like so:
film still, a man in a hotel bar, style by thriller and cubism --ar 3:2 --seed 3000

Below is another example. Can you guess which art style I combined with “thriller”?

However, the limit of this approach is reached when referring to actual artists or directors. With a prompt like that, you can imitate the overall style of a movie by a certain director, but you will also eventually end up with a depiction of the director.
film still, a man in a hotel bar, style by wes anderson --ar 3:2 --seed 3000

You can weaken this effect by adding “directed by” before referring to directors and cinematographers. But keep in mind that this will always tend to include visual features of the directors/cinematographers when you have only one character in the scene. When there are multiple characters like in the above “street scene” prompts, the effect is not so strong.
film still, a man in a hotel bar, style by 1960s, directed by wes anderson --ar 3:2 --seed 3000

Besides, referring to the style of existing artists raises some ethical questions. Why not re-invent with AI what famous directors did with their tools (different lighting and camera types)?
If we get control over lighting, camera types, lenses, and shots, shouldn’t we be able to reconstruct certain styles with the right prompts? We’ll cover this in parts 2 and 3 of the “Cinematic Prompt” series. Thanks for reading! 🙏






