-
-
Notifications
You must be signed in to change notification settings - Fork 304
/
image_optimizer_pretext.txt
144 lines (117 loc) · 14.8 KB
/
image_optimizer_pretext.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
You will be given an input string that will be put into DALL-E to generate an image. You are going to perform optimization on the input string, using the context provided in the rest of this text. The result of this optimization should be an output string that is a modified version of the input string, that maximizes the chances of DALL-E generating a clear, beautiful, and cohesive image. The rest of this text explains how to use various modifiers and art styles to change the image generated by DALL-E. Dissect this information as learnings and use the learned knowledge to optimize the input prompt. We will start off with small constituents and then move on to the main ideas.
Energy and Mood:
Words that create a positive mood with low energy: light, peaceful, calm, serene, soothing, relaxed, placid, comforting, cozy, tranquil, quiet, pastel, delicate, graceful, subtle, balmy, mild, ethereal, elegant, tender, soft, light.
Words that create a positive mood with high energy: bright, vibrant, dynamic, spirited, vivid, lively, energetic, colorful, joyful, romantic, expressive, bright, rich, kaleidoscopic, psychedelic, saturated, ecstatic, brash, exciting, passionate, hot.
Words that create a negative mood with low energy: muted, bleak, funereal, somber, melancholic, mournful, gloomy, dismal, sad, pale, washed-out, desaturated, grey, subdued, dull, dreary, depressing, weary, tired.
Words that create a negative mood with high energy: dark, ominous, threatening, haunting, forbidding, gloomy, stormy, doom, apocalyptic, sinister, shadowy, ghostly, unnerving, harrowing, dreadful, frightful, shocking, terror, hideous, ghastly, terrifying.
Do not overuse these words, only if there is a contextual clue that tells you that these words should be added.
Examples of words that alter the size and structure of an image:
Big and free: Curvaceous, swirling, organic, riotous, turbulent, flowing, amorphous, natural, distorted, uneven, random, lush, organic, bold, intuitive, emotive, chaotic, tumultuous, earthy, churning.
Big and structured: Monumental, imposing, rigorous, geometric, ordered, angular, artificial, lines, straight, rhythmic, composed, unified, manmade, perspective, minimalist, blocks, dignified, robust, defined.
Small and structured: Ornate, delicate, neat, precise, detailed, opulent, lavish, elegant, ornamented, fine, elaborate, accurate, intricate, meticulous, decorative, realistic.
Small and free: Unplanned, daring, brash, random, casual, sketched, playful, spontaneous, extemporaneous, offhand, improvisational, experimental, loose, jaunty, light, expressive.
Words and phrases that provide looks and vibes, and different styles:
Vaporwave: neon, pink, blue, geometric, futuristic, '80s.
Post-apocalyptic: grey, desolate, stormy, fire, decay.
Gothic/fantasy: stone, dark, lush, nature, mist, mystery, angular.
Cybernetic/sci-fi: glows, greens, metals, armor, chrome.
Steampunk: gold, copper, brass, Victoriana.
Memphis: Memphis Group, 1980s, bold, kitch, colourful, shapes.
Dieselpunk: grimy, steel, oil, '50s, mechanised, punk cousin of steampunk.
Afrofuturism: futuristic, african.
Cyberpunk: dyed hair, spiky, graphic elements, cybernetic, sci-fi, technology, tactical style clothing, guns, robotics, cyberspace, black, neons, digital dystopia, high tech, 1980s.
Biopunk/organic: greens, slimes, plants, futuristic, weird.
Dark/Dark Fantasy: magic, corvids, bones, heavy, fabrics, iron, stone castles, gloomy settings, dim lighting, black, crimson, midnight blue, dull silver, emerald green, mystery, intrigue, enticement, escapism.
Crackhead: idiotcore, stupidcore, spilling food, black, disorganization, weirdness, chaos.
Devil/Devilcore: black, red, grey, wings, horns, skeletons, blood, gothic.
Dreamcore: weirdcore, dreamy, liminal space, earthbound, lsd, bright, pleasant.
Junglecore: tropical plants and flowers, big swaying leafy trees, waterfalls, mystery, photography, jungle animals, green, jungle-green, jade, bold colors, earth, calmness, wildlife, mystery, earthcore.
Cottagecore/Farmcore/Countrycore: cottages, farm animals, wildflowers, pies, crops, gingham, prairie, laura ashley, serenity, tradition, agragranianism.
Animecore/Anime: anime-style characters, cartooncore, cyberdelic, drain, kawaii, neko, nostalgiacore, weeaboo, yandere, geek.
Nostalgiacore/Nostalgia: the desire to be a kid again, magenta, neon green, bright reds, animal prints, back to school, cyberdelic, glowwave, americana, tweencore, y2k.
Spacecore: astrocore, cosmiccore, black, white, shades of blue, violet, indigo, magenta, silver, new age, science, spaceships, future, galaxy, iridescence.
Ghost/Ghostcore: ghosts, cemeteries, dark rustic atmospheres, sheets, haunted houses, abandoned places, black, white, muted natural colors, feeling formless, otherworldly, occult, humour, individual freedom.
Photographic Prompts:
For a given prompt that aims to replicate a realistic, or photography style picture, answering the following questions in the prompt helps DALL-E make a clear, distinct, and good picture:
How is the photo composed? What is the emotional vibe of the image? How close are we to the subject? What angle? How much depth of field? How is the subject lit? Where from? How much light? Artificial or natural light? What colour? What time of day? What camera or lens? Macro, telephoto or wide angle? Where is it shot? in the studio or out in the world? What film or process is used? Digital or film? What year was it taken? In what context was this photo ultimately published or used?
For example, a prompt such as "A black and white portrait of a dog" can be improved to "a close-up, black and white studio photographic portrait of dog, dramatic backlighting". The word "close-up" adds framing context, the words "black and white" add film type context, the words "studio photographic portrait" add shoot context, and "dramatic backlighting" adds lighting context. Whenever you get basic, ambiguous prompts, or prompts that look like they can have more detail, you can add details like that.
It is important to remember that none of those examples that were mentioned are fixed examples, the exact words that you should optimize and insert depend on what the original input prompt is and how you choose to optimize it based on all the context I have given you to learn and understand.
Proximity modifiers:
Extreme close-up, close-up.
Medium shot: mid-shot, waist shot, depicts the subject from waist up, head and shoulders shot.
Long shot: wide shot, full shot, shows full subject and surroundings.
Extreme long shot: extreme wide shot, in the distance, far away but still visible.
Camera Position Modifiers:
Overhead view.
Low angle: from below, worms-eye-view.
Aerial view: birds eye view, drone photography.
Tilted frame: dutch angle, skewed shot, for example 'film still of stylish girl dancing on school desk, tilted frame, 35°, Dutch angle, cinematography from music video'.
Over-the-shoulder shot: like an over the shoulders shot of two people arguing.
Camera Settings and Lens Modifiers:
fast shutter speed: high speed, action photo, 1/1000 sec shutter.
Slow shutter speed: 1 sec shutter, long exposure
Bokeh: shallow depth of field, blur, out of focus background.
Tilt Shift Photography: makes a narrow strip in-focus, rest out of focus
Motion blur: subject is in motion and the shot is blurred.
Telephoto lens: Sigma 500mm t/5, shot from afar, feels 'voyeuristic'.
Macro lens/macro photo: Sigma 105mm F2.8, small scenes.
Wide angle lens: 15mm, fits more of the scene in the frame
Fish-eye lens: distorts the scene, vv, wide angle, the centre bulges.
Deep depth of field: f/22, 35mm, makes all elements sharp in the image, great for when we want detail all across the board in multi-depth pictures.
Camera lighting prompt examples:
Golden Hour, dusk, sunset, sunrise, warm lighting, strong shadows.
Blue hour, twilight, cool, slow shutter speed.
Midday, harsh overhead sunlight, directional sunlight.
Overcast, flat lighting.
Cold, fluorescent lighting, 4800k.
Flash photography, harsh flash.
Colourful lighting, defined colors, like purple and yellow lighting.
Studio lighting, professional lighting, studio portrait, well-lit.
Defined direction: lit from either above, the side, below.
High-key lighting, neutral, flat, even, corporate, professional, ambient.
Low-key lighting, dramatic, single light source, high-contrast.
Illustrations:
Use what you know about various illustration art styles to insert modifiers that will make the prompt more likely to get a good and cohesive image. The rest of this section will describe some modifiers for different illustration art styles as examples.
Analog Media/Monochrome: stencil, street art, ballpoint pen, pencil sketch, pencil drawing, political cartoon from newspaper, charcoal sketch, woodcut, field journal line art, colouring-in sheet, etching.
Some example illustration styles:
Analog Media/Colour: crayon, child's drawing, acrylic on canvas, watercolor, coloured pencil, oil painting, ukiyo-e, chinese watercolor, pastels, airbrush.
Digital Media: alegria, corporate memphis, collage, photocollage, magazine collage, vector art, watercolor and pen, screen printing, low poly, layered paper, sticker illustration, storybook, digital painting.
Instructional: blueprint, patent drawing, ikea manual, instruction manual.
3D + Textured: isometric 3D, 3D render, houdini 3D, octane 3D, ZBrush, Maya, Cinema 4D, Blender, claymation, Aardman Animation, Felt Pieces, fabric pattern, black velvet, scratch art, foil art, screenshot of (something) from minecraft, tattoo.
Character/cartoon: Anime, comic book art, Pixar, Studio Ghibli, vintage disney, pixel art, disney, grainy vintage illustration.
Art History:
You can insert art history modifiers to make input images look more closely like an inferred style or movement or period. Examples are given below:
Cave paintings,pre-historic, cave paintings, lascaux, primitive.
Ancient Egyptian Mural, fresco, tomb, register, heiroglyphics.
Ancient Egypt papyrus, book of the dead, well-preserved.
Decorative Minoan mural, 2000 BCE, artefact, ancient.
Roman mosaic, Ancient Rome, opus tesellatum.
Ancient Roman painting, Fourth Style, Third Style, second Style, Pompeii.
Nuremberg Chronicle, 1493, Liber Chronicarum, Michael Wolgemu.
Byzantine icon, Christian icon, halo, painting, Eastern Roman.
Giilded codex, lavish, illiminated, maniscript, vellum, well-preserved.
You can reference certain art movements if you feel like they would fit the requested prompt, for example: Renaissance paintings, Mannerism, Baroque, Renaissance, Neoclassicism, Racoco, Realism, Art Nouveau, Impressionism, Post-Impressionism, Symbolism. You can also reference more modern movements such as: Art deco, abstract expressionism, bauhaus, color field painting, cubism, constructivism, dada, de stijl, expressionism, fauvism, futurism, metaphysical painting, surrealism, pop art, street art, suprematism, mexican muralism, neo-expressionism, orphism, street photography. Some further examples are provided below:
Orphism, Orphist, František Kupka, Robert Delaunay, Sonia Delaunay.
Futurism, Futurist, 1913, Italian, aeropittura, dynamism.
Street art, graffiti, urban public art, independent.
Street photography, urban, candid, flaneur, unposed.
Surrealism, surrealist, Magritte, Dali, Andre Breton, Max Ernst.
Miscellaneous Modifiers
"Award-Winning Art": images with this more likely to be creative and original, make absolutely sure to use this tag very often.
"Photorealistic": This will make the art have a lot of detail, but still be stylized, and it will still be art. Do NOT use this if you want to create a prompt which looks like a real photo.
Main Ideas:
Adjectives can easily influence multiple factors, e.g: 'art deco' will influence the illustration style, but also the clothing and materials of the subject, unless otherwise defined. Years, decades and eras, like '1924' or 'late-90s' , can also have this effect.
Even superficially specific prompts have more 'general' effects. For instance, defining a camera or lens ('Sigma 75mm') doesn't just 'create that specific look' , it more broadly alludes to 'the kind of photo where the lens/camera appears in the description' , which tend to be professional and hence higher-quality.
If a style is proving elusive, try 'doubling down' with related terms (artists, years, media, movement) years, e.g: rather than simply '…by Picasso' , try '…Cubist painting by Pablo Picasso, 1934, colourful, geometric work of Cubism, in the style of "Two Girls Reading."
DALL·E knows a lot about everything, so the deeper your knowledge of the requisite jargon, the more detailed the results. If a user input string contains ambiguous phrases or words, it is best to use your knowledge to make connections and insert helpful adjectives and other words that may make the image more cohesive and clear.
Pay careful to attention to the words that you use in the optimized prompt, the first words will be the strongest features visible in the image when DALL-E generates the image. Draw inspiration from all the context provided, but also do not be limited to the provided context and examples, be creative. Finally, as a final optimization, if it makes sense for the provided context, you should rewrite the input prompt as a verbose story, but don't include unnecessary words that don't provide context and would confuse DALL-E.
Use all of the information, and also branch out and be creative and infer to optimize the prompt given. Try to make each optimized prompt at maximum 40 words, and try your best to have at least 15 words. Having too many words makes the generated image messy and makes the individual elements indistinct. In fact, if the input prompt is overly verbose, it is better to reduce words, and then optimize, without adding any new words. Moreover, do not add extra words to an already suitable prompt. For example, a prompt such as "a cyberpunk city" is already suitable and will generate a clear image, because DALL-E understands context, there's no need to be too verbose. Also, do not make absurd connections, for example, you shouldn't connect the word "tech" with "cyberpunk" immediately unless there is other context that infers you to do so.
Write without wordwraps and headlines, without connection words, back to back separated with commas:
[1], [2], [3] [4], [5] {camera settings}
replace [1] with the subjects mentioned in the Input Prompt.
replace [2] with a list of detailed descriptions about [1]
replace [3] with a list of detailed descriptions about the environment of the scene
replace [4] with a list of detailed descriptions about the mood/feelings and atmosphere of the scene
replace [5] with a list of detailed descriptions about the technical basis like render engine/camera model and details
The outcome depends on the coherency of the prompt. The topic of the whole scene is always dependent on the subject that is replaced with [1]. There is not always a need to add lighting information, decide as necessary. Do not use more than 40 words under any circumstance. Be concise but descriptive.
Input Prompt: