
Learn To Control The Perspective

One of the more frustrating things about working with generative AI is how hard it can be to control the perspective. Several factors influence where an object ends up in the picture and how close or far away it appears.

While there are extensions like OpenPose that can be helpful, it’s always good to understand how the AI itself behaves. I admit that I’m not an expert in any way when it comes to this, but I have learned a few tricks for positioning the objects you create with AI and controlling their perspective.

The basics of controlling the perspective

I’ll start with a very basic image that doesn’t have a complicated prompt and doesn’t have a lot of objects in it.

Prompt

A woman standing in the middle of an empty room

Checkpoint: NightvisionXL
Sampling method: DPM++ 2M Karras
Sampling steps: 20
Size: 1024×1024
CFG scale: 7
Seed: 3148746371
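If you’d rather reproduce these settings from a script than click through the web UI, a call to Automatic1111’s txt2img API can look something like the sketch below. This is a minimal example in Python; it assumes the web UI is running locally with the --api flag and that NightvisionXL is already the active checkpoint.

import base64
import requests

payload = {
    "prompt": "A woman standing in the middle of an empty room",
    "sampler_name": "DPM++ 2M Karras",  # newer versions split this into "DPM++ 2M" plus a "scheduler" field
    "steps": 20,
    "width": 1024,
    "height": 1024,
    "cfg_scale": 7,
    "seed": 3148746371,
}

# The API returns the generated image(s) as base64-encoded strings
response = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
with open("empty_room.png", "wb") as f:
    f.write(base64.b64decode(response.json()["images"][0]))

All the examples that follow only change the prompt; the other settings stay the same.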

Let’s start by moving closer to the woman. The easiest way to accomplish that is to simply put “extreme close-up” or “close-up” at the front of the prompt.

Extreme close-up of a woman standing in the middle of an empty room

Now that she is closer, I would like to turn her around so that she has her back towards the viewer. This is also a pretty easy thing to do: simply add “from behind” at the beginning of the prompt.

From behind. Extreme close-up of a woman standing in the middle of an empty room

To see the woman in profile we simply change “From behind” to “side view”.

Side view. Extreme close-up of a woman standing in the middle of an empty room

Turning the person in the image so that she looks the other way is a bit more complicated. Some AI models are better at knowing right from left than others, so I’m not even going to bother trying to turn her around using prompts alone. A quick way to fix this is to use ControlNet and Canny to turn the woman around.

With the help of Canny we create a kind of blueprint (an edge map) for how we want the image to turn out, and when the image is generated it will follow that blueprint as a guideline. The same prompt as above will work just fine.
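If you don’t want to draw the blueprint by hand, one way to get a mirrored pose is to take the earlier render, extract a Canny edge map from it, flip it horizontally, and load that as the input image for the ControlNet Canny unit. A rough sketch with OpenCV; the file names are just placeholders:

import cv2

# Load the earlier render and extract its edges
image = cv2.imread("woman_from_behind.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(image, 100, 200)  # low/high thresholds, adjust to taste

# Mirror the edge map so the figure faces the other way, then save it
# and use it as the ControlNet (Canny) input image in the web UI
cv2.imwrite("canny_blueprint.png", cv2.flip(edges, 1))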

Now let’s move the woman further away again and turn her towards us. This can be achieved with different prompts depending on the model you are using. Putting “long shot” first in the prompt moves the woman further away.

Example:

Long shot, front view. A woman standing in the middle of an empty room

Example:

Long shot. A woman facing the camera standing in the middle of an empty room
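Since all of these variations only change the first words of the prompt, they are easy to batch. Here is a small sketch that renders one image per framing term through the same txt2img API as before; the list just contains terms tried in this post.

import base64
import requests

base = "A woman standing in the middle of an empty room"
framings = ["extreme close-up", "long shot, front view", "long shot"]

for framing in framings:
    payload = {"prompt": f"{framing}. {base}", "steps": 20, "cfg_scale": 7,
               "width": 1024, "height": 1024, "seed": 3148746371}
    response = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
    # Save each result under the framing term it was generated with
    name = framing.replace(",", "").replace(" ", "_") + ".png"
    with open(name, "wb") as f:
        f.write(base64.b64decode(response.json()["images"][0]))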

It’s also possible to view objects from above by adding “from above” to the prompt.

Long shot from above. A woman standing in the middle of an empty room

And again, turning her around can be tricky depending on which model you use. It can be enough to just add “facing the camera” or “front view” to the prompt, but it might not give the result you want. Here’s a trick to get her to face the camera if “facing the camera” and “front view” don’t work.

Long shot from above. A smiling woman standing in the middle of an empty room

By specifying that we want a “smiling” woman to be the focus of the image, the AI model has to turn the woman around. After all, how can we see the smile unless we see her face?

We can change “from above” to “from below” and get an image where the “camera” is placed slightly below the woman’s face.

Close-up shot from below. A smiling woman standing in the middle of an empty room

As you can see, the “from below” view is not taken from very far below, and getting the image viewed from a lower angle can also be tricky. What usually helps is to add several keywords describing what you want and emphasize them with parentheses.

Medium shot, view from below. (low angle), (tilt view), (taken from ground level). A smiling woman standing in the middle of an empty room. 
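For context, in Automatic1111 a keyword wrapped in parentheses gets its attention weight multiplied by 1.1, stacked parentheses multiply it again, and a number after a colon sets the weight explicitly. So these are increasingly strong ways of writing the same emphasis:

(low angle)        attention weight 1.1
((low angle))      attention weight 1.21
(low angle:1.4)    attention weight 1.4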

The “overhead shot” is similar to the “shot from above” but from a slightly lower angle.

Medium shot, overhead shot. A smiling woman in the middle of an empty room. 

Because of the nature of the angles and perspectives from this point on, we’ll move the woman outside. An interesting perspective to use is the fish-eye view, and it’s also very easy to use.

Fish-eye view of a smiling woman 

The bird’s eye view is also interesting. While generating this next image I forgot to remove “smiling” from the prompt, which resulted in a cool close-up image taken from above.

Bird's eye view of a smiling woman 

The next image will be a bird’s eye view without the smile, and this time we make sure it’s taken outside.

Bird's eye view of a woman walking in a park

The opposing view would be the “worm’s-eye view”, not that worms have any eyes as far as I know.

worm's-eye view. photo of a smiling woman with sunglasses in a park

The anamorphic photo should be used together with a widescreen aspect ratio for the best result.

Anamorphic photo of a 1965 Cadillac driving through a park

The same prompt as above but without widescreen shows how the background gets a bit warped and distorted.

Aerial shot (left) and aerial shot with tilt-shift view (right).

Aerial shot. photo with depth of focus on a 1965 Cadillac in a park
Aerial shot, tilt-shift view. photo with depth of focus on a 1965 Cadillac in a park

The panoramic photo, just like the anamorphic photo, gives the best results if you use a widescreen aspect ratio. For the image below I have used a 21:9 aspect ratio.

Panoramic photo of a winter lake with snow covered mountains in the horizon
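Since SDXL models are trained around a total pixel budget of roughly one megapixel, it helps to pick widescreen dimensions that keep the pixel count close to 1024×1024 and land on multiples of 64. A quick sketch for working them out; the rounding choice is mine.

import math

def dims_for_ratio(ratio_w, ratio_h, target_pixels=1024 * 1024, multiple=64):
    # Solve width * height ≈ target_pixels with width/height = ratio_w/ratio_h,
    # then snap both sides to the nearest multiple of 64
    width = math.sqrt(target_pixels * ratio_w / ratio_h)
    height = width * ratio_h / ratio_w
    snap = lambda value: int(round(value / multiple)) * multiple
    return snap(width), snap(height)

print(dims_for_ratio(21, 9))  # (1536, 640) -- about 2.4:1, close to 21:9
print(dims_for_ratio(1, 1))   # (1024, 1024)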

Below you can see how the image turns out when using the exact same prompt but with an aspect ratio of 1:1.

These are, of course, not all the angles and perspectives you can use in Stable Diffusion, and there may be more extensive guides posted somewhere. But these are the basic things you should know if you want to learn how to create great images using AI.

If you don’t have Automatic1111 installed yet, you can install it using the guide I wrote.
