Today’s experiment was an interesting one: figure out how to get ChatGPT to generate prompts for MidJourney. Because GPT-4 often produces surprising results with barely any additional context, the first step is always just to try something and see what happens. The Alan Thompson persona was presented with the following prompt:
Please generate a MidJourney prompt >
This resulted in a somewhat expected misunderstanding:
Title: The Temporal Artifact
Prompt: Halfway through their journey, a group of time-traveling adventurers stumbles upon a mysterious artifact with the power to alter their perception of reality. As they delve deeper into its secrets, they must confront their own fears and insecurities while unraveling the artifact’s true purpose.
While this is not the intended result, it’s quite interesting how the request was interpreted (and it will likely be used as a prompt for an Alan Thompson story in the near future). In one last attempt to get an untrained GPT instance to complete this task, a slightly clearer prompt was given:
Please generate a prompt for an ai image generator >
Title: Enigmatic Time-Travelers’ Realm
Prompt: Create an image depicting a futuristic time-traveling landscape, featuring a mysterious artifact glowing with ethereal energy, surrounded by the group of adventurers in awe. The scene should evoke a sense of wonder, with a surreal, otherworldly atmosphere filled with vibrant colors and futuristic technology.
Still not the correct result (MidJourney and NightCafe both prefer less “conversational” prompts), so some training will clearly be required. It’s still an interesting prompt, so it was fed to MidJourney anyway:
It’s remarkably interesting, and really, more than passable. The problem with sending these conversational queries to MidJourney, however, is that they all start to get a little samey, which becomes more clear over time. For example, here’s another response from Alan Thompson:
Design an image that portrays a breathtaking oasis situated on an alien planet, complete with exotic flora, crystal-clear water, and radiant, bioluminescent plants. In the background, majestic mountains and a mesmerizing, multi-colored sky with multiple moons create an awe-inspiring atmosphere that showcases the beauty of the cosmos.
And the result from MidJourney:
Still incredibly interesting and appealing, but the results can be better. MidJourney produces the best output when you use text modifiers and parameters to really dial in your artwork.
Getting ChatGPT to generate effective, killer MidJourney prompts
ChatGPT (especially GPT-4) is incredibly “skilled” at handling, remembering, and processing context: it can remember “rulesets” and process requests based on earlier messages in the conversation. This advantage can be used to effectively “dial in” a personality in ChatGPT. Some notable examples are the Alan Thompson persona responsible for the two prompts already demonstrated and Penelope Ainsworth, a very loquacious, somewhat pretentious historical fiction “author,” both of which are quite effective at “staying in their lanes” with regard to the content they produce. Using similar methods, an image-prompt generator can be effectively created.
The process to get super-refined results is surprisingly involved, so this guide covers only the basics. That should still be enough to get a good start on creating a new persona for generating interesting prompts.
An ideal prompt for MidJourney consists of three parts: image sources (for remixing or pulling in “inspiration”), a text prompt for the desired result, and parameters that define the output of the image. An example prompt would look like this (though this one does not include image sources):
A cozy library filled with antique books, a warm fireplace, and comfortable armchairs::2 dimly lit::3 hushed whispers --seed 111222333 --style 4a --quality 1 --aspect 3:2
An explanation of the parameters is outside the scope of this article, but more information can be found in MidJourney’s documentation.
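To make the structure concrete, here is a minimal Python sketch that splits a prompt like the example above into its text part and its parameters. The function name `split_prompt` is hypothetical, and the sketch assumes that parameters are written with two leading hyphens (`--`) and that each one takes exactly one value. It is an illustration of the layout, not a full parser for everything MidJourney accepts.

```python
def split_prompt(prompt):
    """Split a prompt into (text, parameters).

    Assumes every parameter looks like ' --name value' and that the
    text part itself never contains ' --'.
    """
    text, *flag_chunks = prompt.split(" --")
    parameters = {}
    for chunk in flag_chunks:
        # 'seed 111222333' -> name='seed', value='111222333'
        name, _, value = chunk.strip().partition(" ")
        parameters[name] = value
    return text.strip(), parameters


example = (
    "A cozy library filled with antique books, a warm fireplace, "
    "and comfortable armchairs::2 dimly lit::3 hushed whispers "
    "--seed 111222333 --style 4a --quality 1 --aspect 3:2"
)
text, parameters = split_prompt(example)
print(text)
print(parameters)
```

Keeping the text and the parameters separate like this makes it easier to reason about each half of the prompt on its own.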
Step 1: Some rules.
After a considerable amount of trial and error, it has become quite clear that the more rules you give ChatGPT, the better the outcome. Surprisingly, being polite is also a good idea, but that’s a topic for another post. To get a good MidJourney bot, start with a clear outline of the purpose:
In this conversation, i would like you to be an expert AI image prompt generator. Before we start i am going to give you a couple of lists >
ChatGPT will respond that it’s ready with varying amounts of exposition and niceties, and then it’s time to move on to defining some rules about prompts. ChatGPT is very good at remembering information, and can be instructed to use previously explained information in its responses.
Ok, in this first list, i will structure the items like this:
parameter <possible values>
for example, if the parameter is: seed <0-4294967295> then when you use “seed” you will render it like this:
--seed random number between 0 and 4294967295
do you understand?
Surprisingly, asking for confirmation is important here. GPT’s tendency to be helpful often means it makes assumptions about what to do next, and it can start doing things like generating lists for you or coming up with ideas. Asking it to confirm its understanding appears to keep the engine within the context of the prompt.
Next, we give it the list:
seed <0-4294967295>
quality <.25, .5, 1>
style <4a, 4b, 4c>
chaos <0-100>
aspect <1:2, 2:1, 1:1>
stylize <0-1000>
Thank you for providing the list of parameters and their possible values. Now I can generate image prompts using these parameters. Please let me know what kind of image prompt you’d like to generate, and I’ll create a prompt using the given parameters.
Ideally, a request should be made at this point to ensure that the engine understands the intended behavior of the list.
great, thank you! before we do that let’s make sure you have the right information. Please output a style
Sure, here’s a randomly selected style from the provided list:
style 4b
In this case, it did not fully grasp the context, so a quick refresher is in order:
please remember to output all “parameter” items with two leading dashes, like this: --style
GPT then expresses understanding, outputs an example and it’s time to move on. If desired, a request can be made to output all possible parameters and their values to check for comprehension. It should be correct, but if not, simply correct the chatbot and the results will update accordingly.
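For comparison, the rule ChatGPT is being taught here (pick a value from the allowed set, then prefix the name with two dashes) can be sketched in a few lines of Python. The table below copies the parameter list given to ChatGPT above. The names `PARAMETERS` and `render_parameter` are hypothetical, and representing numeric ranges as `(low, high)` tuples is simply a convenience for this sketch.

```python
import random

# Allowed values, copied from the list given to ChatGPT.
# Tuples are inclusive numeric ranges; lists are fixed choices.
PARAMETERS = {
    "seed": (0, 4294967295),
    "quality": [".25", ".5", "1"],
    "style": ["4a", "4b", "4c"],
    "chaos": (0, 100),
    "aspect": ["1:2", "2:1", "1:1"],
    "stylize": (0, 1000),
}


def render_parameter(name, rng):
    """Return one parameter as '--name value' with a random allowed value."""
    spec = PARAMETERS[name]
    if isinstance(spec, tuple):
        low, high = spec
        value = rng.randint(low, high)  # inclusive on both ends
    else:
        value = rng.choice(spec)
    return f"--{name} {value}"


rng = random.Random()
for name in PARAMETERS:
    print(render_parameter(name, rng))
```

Asking ChatGPT to output every parameter, as suggested above, should produce lines of the same shape as this loop prints.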
Now, it’s time to tell it how to generate the text prompts
This process can be as involved or as simple as desired. The output obtained from MidJourney will be directly related to the work that is put into generating the text prompts. This guide covers the basics of tuning and describes how to go a bit further. The total number of prompts used to tune PlayingWithGPT’s AI image generator exceeded 50 (requiring two “wait” cycles in GPT-4), but the results are worth it.
It’s important to think of tuning like this as an additive process rather than a destructive one: it’s a lot easier to add context and “rules” to a conversation’s persona than to remove them.
For example, our first explanation of how to generate text prompts was as simple as:
Ok – great! Now, I’m going to describe what the text prompts should be like.
Keep in mind, that these are not a list of prompts to choose from, rather a set of examples of how prompts should be styled
When writing these prompts, you can separate coherent ideas using double colons, such as this ::
for example:
California:: dreaming:: is different than california dreaming
You can also add a random number between 1 and 4 after the :: to indicate how important that word is.
So – here’s a list of example prompts.
california dreaming summer time surfing
hot::2 dog:: portrait:: illustration
photograph of a 1920’s flapper:: black and white::2
ghibli style:: rocket launch in summer
Thank you for providing the guidelines and examples for creating text prompts. Based on your instructions, I will generate text prompts in the specified style, separating coherent ideas using double colons and adding a random importance value between 1 and 4. Here are some example prompts I have generated:
1) serene mountain landscape::3 waterfall::2 forest::1
2) fantasy::3 cityscape:: night sky::1 stars::4
3) tropical::2 beach:: sunset::4 silhouette:: palm trees
4) cyberpunk:: city streets::3 neon lights::1 rain::2
Please let me know if these prompts meet your expectations and if you would like me to generate more prompts or make any adjustments.
GPT understands what we are looking for, which is excellent.
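The double-colon convention described to ChatGPT can also be illustrated in code. The Python sketch below splits a text prompt on `::` and reads the optional number that follows each separator as the weight of the segment before it. The function name `parse_weighted_prompt` is hypothetical, and treating a missing number as a weight of 1 is an assumption made for this sketch.

```python
import re


def parse_weighted_prompt(text):
    """Split 'hot::2 dog:: portrait' into [(segment, weight), ...].

    The number after '::' belongs to the segment before it.
    Segments without a number get an assumed default weight of 1.
    """
    # re.split with a capture group alternates: segment, weight, segment, ...
    parts = re.split(r"::(\d+(?:\.\d+)?)?", text)
    segments = []
    for i in range(0, len(parts), 2):
        segment = parts[i].strip()
        if not segment:
            continue  # e.g. the empty tail after a trailing '::1'
        raw_weight = parts[i + 1] if i + 1 < len(parts) else None
        segments.append((segment, float(raw_weight) if raw_weight else 1.0))
    return segments


print(parse_weighted_prompt("hot::2 dog:: portrait:: illustration"))
print(parse_weighted_prompt("serene mountain landscape::3 waterfall::2 forest::1"))
```

Read this way, "hot::2 dog" is two separate ideas with "hot" emphasized, while "california dreaming" without any separator stays a single idea.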
Step 2: put the rules together
Now that ChatGPT understands the rules, it needs to understand how to put them together. If the process has been completed correctly so far, a single prompt should be all that is required. (Please note that since the /imagine prompt command is entered directly into the Discord text entry field, it’s not necessary for ChatGPT to include it; it is included in this guide for clarity’s sake.)
here’s how you will structure the completed prompt:
/imagine prompt <text prompt> <one to three parameters>
Understood! I will generate complete image prompts using the provided structure. Here are a few examples:
/imagine prompt underwater::3 tea party:: octopus::2 wearing a top hat::1 --style 4a --chaos 80 --stylize 540
/imagine prompt floating::4 islands:: interconnected by rainbow bridges:: llamas in space suits::2 --seed 123456789 --quality 1 --aspect 1:1
/imagine prompt cyberpunk:: city streets::3 neon lights::1 rain::2 --style 4c --quality 0.5 --chaos 50
/imagine prompt fantasy::3 cityscape:: night sky::1 stars::4 --seed 987654321 --style 4b --aspect 2:1
These prompts combine the text prompts with one to three parameters. If you need any adjustments or more prompts, feel free to let me know!
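The structure given to ChatGPT (the command, then the text prompt, then one to three parameters) can be sketched as a small Python function. The name `build_imagine_command` and its arguments are hypothetical. The sketch assumes the parameter values have already been chosen and only decides how many of them to attach.

```python
import random


def build_imagine_command(text_prompt, parameters, rng, max_params=3):
    """Assemble '/imagine prompt <text prompt> <one to three parameters>'.

    `parameters` maps names to already-chosen values; a random subset of
    one to `max_params` of them is attached to the text prompt.
    """
    count = rng.randint(1, min(max_params, len(parameters)))
    chosen = rng.sample(sorted(parameters), count)
    flags = " ".join(f"--{name} {parameters[name]}" for name in chosen)
    return f"/imagine prompt {text_prompt} {flags}"


rng = random.Random()
print(
    build_imagine_command(
        "cyberpunk:: city streets::3 neon lights::1 rain::2",
        {"style": "4c", "quality": ".5", "chaos": 50, "seed": 987654321},
        rng,
    )
)
```

Each run attaches a different subset of parameters, which mirrors the variety in ChatGPT’s examples above.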
Step 3: Tuning
Simply running to MidJourney with the results above is fascinating enough:
A bit of tuning can improve our results:
Please feel free to include more complete thoughts or descriptions in the text prompts. For example: “a dog eating a skyscraper at sundown” or “a desert landscape at night::2 turbulent skies”. Also feel free to use exceptionally long descriptors too, such as “a crowd of people, partying, enjoying themselves at night while a DJ plays dance music from the 80s”
Thank you for the guidance. I’ll generate text prompts with more complete thoughts, descriptions, and occasionally longer descriptors. Here are some new text prompts:
a curious cat exploring an ancient Egyptian tomb::2 golden artifacts
an astronaut playing chess with an alien on the moon::1 Earth in the background
a group of musicians performing on a floating stage in the middle of the ocean at sunset
a bustling city street market::3 filled with exotic fruits and spices:: colorful tents
Now, I’ll combine these with the parameters as before:
/imagine prompt a curious cat exploring an ancient Egyptian tomb::2 golden artifacts --seed 345987123 --style 4a --quality 0.5 --aspect 1:1
/imagine prompt an astronaut playing chess with an alien on the moon::1 Earth in the background --style 4b --chaos 40 --aspect 2:1
/imagine prompt a group of musicians performing on a floating stage in the middle of the ocean at sunset --seed 789123456 --style 4c --quality 1 --stylize 800
/imagine prompt a bustling city street market::3 filled with exotic fruits and spices:: colorful tents --quality 0.5 --chaos 70 --aspect 1:2
I hope these new prompts with more complete thoughts and descriptions better suit your needs. If you need any further adjustments or more prompts, feel free to ask!
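Because ChatGPT does not always follow the rules exactly (the missing dashes earlier are one example), it can be worth checking a generated command against the rules before pasting it into MidJourney. The Python sketch below is one hypothetical way to do that. The names `ALLOWED` and `check_command` are made up for illustration, and the allowed values are the ones from the list given to ChatGPT in this guide, not MidJourney’s full set of options.

```python
import re

# Allowed values from the list given to ChatGPT in this guide.
# Tuples are inclusive integer ranges; lists are fixed choices.
ALLOWED = {
    "seed": (0, 4294967295),
    "quality": [0.25, 0.5, 1.0],
    "style": ["4a", "4b", "4c"],
    "chaos": (0, 100),
    "aspect": ["1:2", "2:1", "1:1"],
    "stylize": (0, 1000),
}


def check_command(command, max_params=3):
    """Return a list of rule violations (an empty list means none found)."""
    problems = []
    if not command.startswith("/imagine prompt "):
        problems.append("missing '/imagine prompt' prefix")
    flags = re.findall(r"--(\w+)\s+(\S+)", command)
    if not 1 <= len(flags) <= max_params:
        problems.append(f"expected 1 to {max_params} parameters, found {len(flags)}")
    for name, raw in flags:
        spec = ALLOWED.get(name)
        if spec is None:
            problems.append(f"unknown parameter --{name}")
            continue
        if isinstance(spec, tuple):
            low, high = spec
            ok = raw.isdigit() and low <= int(raw) <= high
        elif name == "quality":
            try:
                ok = float(raw) in spec  # accepts both '.5' and '0.5'
            except ValueError:
                ok = False
        else:
            ok = raw in spec
        if not ok:
            problems.append(f"--{name} {raw} is not an allowed value")
    return problems


print(check_command(
    "/imagine prompt an astronaut playing chess with an alien on the moon::1 "
    "Earth in the background --style 4b --chaos 40 --aspect 2:1"
))
```

A command that attaches four parameters, or an aspect value that is not on the list, would be reported here even though MidJourney itself may accept it. Whether to correct ChatGPT or relax the rule is a judgment call.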
These prompts, at this point, are fully functional and will generate interesting, intriguing results on MidJourney. Continuing to push the tuning, however, results in responses like:
/imagine prompt A cozy library filled with antique books, a warm fireplace, and comfortable armchairs::2 dimly lit::3 hushed whispers --seed 111222333 --style 4a --quality 1 --aspect 3:2
and the results can be quite striking:
What sort of interesting results in MidJourney can you get from prompts generated via ChatGPT? Share them with us on Twitter!