Like or Loathe Midjourney, Photographers Currently Have an Edge With It



We all know that AI is changing every industry, and photography is far from exempt. What many may not know, however, is that photographers have an edge when it comes to image-generating AI. Let me show you how.

When I wrote my last article on AI, I was concerned that people were experiencing burnout from AI news and debates, but the views said otherwise. So, here I am again. If you’re jaded by the AI topic, save yourself some time and click away, but if you’re not, let me flag something I haven’t seen discussed anywhere else.

I have been using AI in various capacities for several years, but the past 18 months have given birth to a completely new breed of AI, more powerful than anything we’d seen before by light-years. For the past nine months or so, I have been experimenting with Midjourney, one of the premier image-generating AI tools, similar to OpenAI’s DALL-E (which I used before Midjourney) and Stable Diffusion (which I’ve barely used, but which is held in high regard). For the uninitiated, let’s do a quick summary.

What Is Midjourney?

Midjourney is a generative AI model that creates images from text. Describe the image you want to see, and Midjourney generates results based on the enormous dataset of images it was trained on. The text descriptions used to generate these images are called “prompts,” and they can be as simple or as complicated as you choose. While early versions of Midjourney were impressive, the most recent model version, 5.2, allows the creation of images indistinguishable from photographs.
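
To make that concrete, here’s an illustrative example of the sort of prompt I mean (the wording is my own, purely for demonstration; in Discord, prompts are issued via the /imagine command):

  /imagine prompt: a photo-realistic portrait of an elderly fisherman on a harbor at sunrise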

The Edge Photographers and Videographers Have

The first thing to note is that anybody can get a photo-realistic image out of Midjourney, even with the most basic prompts. You might be surprised just how strong the results can be from prompts of only a few words. Many people take this to mean that anyone can create anything, but that’s not necessarily true. What makes prompt-driven AI difficult is controlling the output. Yes, anybody could create a photo-realistic image of an elephant, but to have full control over the setting, the colors, the depth of field, the angle, the light, and so on requires some know-how. Although it doesn’t pertain to Midjourney (but rather to LLMs such as ChatGPT and Bard), there is a reason why “Prompt Engineer” is the most in-demand new job, with over 7,000 roles listed from June 2022 to June 2023, according to Workyard.

Now, having used many different AI tools, I feel confident in saying that the skill ceiling for Midjourney is significantly lower than that of the likes of ChatGPT. Nevertheless, most people do not use text-to-image AI particularly well, just typing basic prompts and hoping to get lucky. You can improve this with various parameters, but where photographers and videographers have the advantage is in using our expertise in the prompt.
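
As a rough illustration of what I mean, a prompt that leans on both photographic language and Midjourney’s parameters might look something like this (the subject is invented; --ar sets the aspect ratio and --stylize controls how strongly Midjourney applies its own aesthetic):

  editorial portrait of a chef in a busy kitchen, 85mm, shallow depth of field, natural window light --ar 3:2 --stylize 100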

Cameras and Lenses

Firstly, it has been shown that including cameras in your prompts can affect quality. It isn’t known how many images Midjourney has been trained on, but the general consensus is that it’s comfortably in the billions. When you include a camera in your prompt, it will likely draw on images taken with that camera (among many other images). In fact, some people found that simply adding “H6D” to the end of a prompt could yield higher-quality results. I suspect many doing this don’t even know that it refers to the $33,000 Hasselblad H6D medium format DSLR.
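
In practice, that can be as blunt as it sounds; something like this invented example:

  wildlife portrait of an elephant at dusk, H6D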

In my experience, which modern camera you choose doesn’t affect the final result all that much in terms of quality, though the sensor size of the camera does often affect depth of field. For example, the images below were generated from identical prompts, varied only by camera: one specified the Hasselblad H6D and one the Fujifilm X100V. That is, one is a medium format sensor and one is an APS-C sensor.

What’s important here is not that the lighting, some elements, or even the model changed; they’re par for the course when you regenerate. What’s interesting here is the depth of field. The background of the X100V image is far closer to focus than in the medium format image. This is accurate, and as photographers, we understand why it happened. So, using a combination of aperture and camera, we can dictate the depth of field.
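
A hypothetical version of that comparison, identical in every respect except the camera, might read:

  candid street portrait of a woman in a red coat, Hasselblad H6D
  candid street portrait of a woman in a red coat, Fujifilm X100V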

Settings

As I mentioned above, the aperture can be used to affect the depth of field of an image, just as it does in real life. If you want a narrow depth of field in your image, you want Midjourney trawling fast apertures. Although it is far from an exact science, primarily because Midjourney has no way of gauging the distance of the subject from the camera in the reference images, the results will at least be in the right direction. Below are two prompts for a headshot on the street: in one I included f/1.4, and in the other, f/11.
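
In skeleton form (paraphrased, not my exact wording), the pair looked like this:

  headshot of a man on a busy street, f/1.4
  headshot of a man on a busy street, f/11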

You can see from the people on the left of the frame how much the aperture impacts the image, and you can find more extreme examples than this. Remember, though, that your words affect the depth of field too.

Terminology

So, words, rather expectedly, play a massive role and often overpower the settings you use in your prompt if they are at odds with one another (for example, a “cinematic headshot” at “f/18” isn’t going to give you a headshot with everything in focus). If you type “snapshot of a man on the street,” your depth of field will likely be wildly different to “editorial headshot of a man on the street.” Below are the results for exactly those two prompts.

What’s more, you don’t have to use photography terms logically for them to work well. One example would be “macro photography” added to any prompt that has nothing to do with macro photography. Those two words will often cause your results to have a narrow depth of field and a generally cinematic look. The below examples show just how much the term “macro photography” can improve the results.
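
In practice, that just means tacking the phrase onto an otherwise unrelated prompt, for example (an invented illustration):

  a barista pouring latte art in a dim café, macro photography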

Lighting

As every photographer and videographer knows, light is the be-all and end-all of our crafts. So, put that to work in Midjourney too. The average person doesn’t know lighting styles by name, but you can use them to control the lighting in Midjourney. As with every tip, remember Midjourney isn’t a simulator, and you’ll sometimes miss your target, but with some experimentation, you can control the output and look of generated images.

In my results, it wildly overcooked the eyes, but you can see how impactful it can be when you dictate the lighting.
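
If you want to try this yourself, naming a specific lighting style is usually enough to steer the result. For instance (illustrative prompts, not the ones behind my images):

  studio portrait of a boxer, Rembrandt lighting, black background
  studio portrait of a boxer, rim lighting, black background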

Miscellaneous Tips

Remember, there is a lot of what will seem like randomness in the results, but really, we just don’t know all the interactions or Midjourney’s source material. Here are some photography-centric tips, with an example combining several of them after the list:

  • Midjourney can replicate film stocks quite well, so use them for a certain aesthetic
  • “Tilt-shift” sometimes works, but it often chooses a high point of view
  • “Color grading” tends to shoot for complementary and bold colors
  • “HDR” does exactly what HDR does most of the time
  • “Cinematic” often results in darker, low-key images
  • “8K” — lots of people add this to the end of prompts, but it causes the results to look fake and CGI in my experience
  • Obscure photography types such as “pinhole” or “infrared” often work well
  • Unusual lenses can work too if they’re well known enough, such as Lensbaby
  • “Low-key” and “high-key” do exactly what you’d hope
  • You can dictate the angle of the shot with “low-angle” or even “drone”
  • Not including something, such as lighting, doesn’t mean “no lighting”, it means “Midjourney, pick the lighting based on what I’ve said in this prompt”
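
To show how several of these can stack in a single prompt, here’s an illustrative example (not a guaranteed recipe; Midjourney will still interpret it loosely):

  low-angle street photograph of a cyclist in the rain, Kodak Portra 400, low-key, cinematic --ar 3:2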

The Ethical Elephant

I put some thought into whether I would add this section, but at this point, it’s a widely known functionality of Midjourney and other AI image generators, so I will address it. However, I have decided not to include any example images.

The “in the style of” component of prompts is arguably the most powerful influence on the look of the final image. This can be used ethically and to great effect, as I have shown above, with the likes of “in the style of a National Geographic photograph” or “in the style of a Vogue cover,” but by getting more specific, you tread on ethically difficult ground. For example, you could add to your prompt “in the style of Annie Leibovitz,” and it will get you closer to her aesthetic. If you combine this with other details, which I am not going to provide, you can get to an image that I’m confident I could fool people into thinking is hers. These sorts of prompts make me uncomfortable, whether you’re referencing a photographer, an artist, or a DoP. This is also one thread of an entire rope of copyright issues surrounding AI image generation.

Final Thoughts

AI is a mixed bag for photographers; it’s powerful, valuable, and revolutionary, but it’s also scary, damaging, and legally uncharted. I resolved to practice using AI of all forms as part of my skill set, and while that is helping me in many ways, it’s also making me aware of where photographers are vulnerable. This is something that is spoken about regularly, so I thought I’d balance the scales a little with some of the advantages us ‘togs have with text-to-image generators such as Midjourney.
