AI Painting Popularization Course [2] Introduction to Text-to-Image

2. Introduction to text-to-image and basic prompt words

What does AI painting have to do with magic and incantations? Recall that the drawing process has one essential step: giving the AI a prompt that describes the picture. Prompts are also used in AI chat. Most prompts for AI artwork are written in English, and they tend to be long and messy, interspersed with all kinds of strange numbers and symbols, much like enigmatic spells. That is why people vividly call writing prompt words "mantra chanting": like magicians, we chant spells to get the results we want. AI may be artificial intelligence, but there is still a gap between it and real human intelligence, and it often does not know what you want. Detailed prompt words help you direct the drawing more precisely. This is also why, as AI painting has become popular, "mantra chanting" has gradually turned into a body of knowledge that can be discussed and studied.

This course uses Stable Diffusion (SD), but I know that many friends use MidJourney, another very popular AI painting application. The overall logic described here is largely universal. MidJourney in fact relies on prompt words more heavily than SD does to create works, so prompt words are applied even more deeply there.

1. Basic concepts of prompt words

Summary:

  • The concept and basic logic of prompt words
  • Prompt word syntax (input, spacing)
  • Content-based prompt words and standardized prompt words

In this lesson, we look at the text-to-image (txt2img) function in Stable Diffusion, which generates images from text. The "text" here is the prompt. Broadly interpreted, a prompt is the text or image information the user supplies to guide the model toward a work that meets specific needs. Put simply, it is the language we use to tell the AI "what I want to draw" and "how to draw it".

In the last class, we mentioned that SD has two basic ways of drawing: text-to-image and image-to-image (img2img). Text-to-image relies on words alone to communicate with the AI, while image-to-image can also convey information through a picture. Even so, image-to-image still takes prompt words, and they are just as important. Prompt words can cover a lot: the theme of the work, the painting style, image characteristics, and specific elements.

For example:

Positive prompt:
(1girl:2.0), solo, nilou \(genshin impact\), solo, long hair, jewelry, blue gemstone, earrings, horns, crown, cyan satin strapless dress, white veil, neck ring, red hair, {green eyes}, ((full body)), (SFW:1.5), front, highly detailed face, curvy body, skindentation, hands up, happy smily face, pureerosface_v1, hiqcgbody,

{{masterpiece}}, {best quality}, {highres}, original, reflection, Exaggerated body proportions, greasy skin, realistic and delicate facial features, depth of field, extremely detailed CG unity 8k wallpaper, bloom, shine, (illustration), (painting), (sketch), anime coloring, fantasy, unreal engine, body shadow, artstation

indoor, sitting on sofa, sunset, backlighting, shiny skin, lens flare, light particles, glowing, dappled sunlight, extreme shadow and light, long shadow, light rays, sun
wind blow, maple leaf, cloudy sky, dusty, forest, plant, flower

Negative prompt:
EasyNegative, (worst quality:2), (low quality:2), (normal quality:2), lowres, normal quality, ((monochrome)),((grayscale)),skin spots, acnes, skin blemishes, age spot, (outdoor:1.6), (ugly:1.331), (duplicate:1.331), (morbid:1.21), (mutilated:1.21)(tranny:1.331), mutated hands, (poorly drawn hands:1.5), blurry, (bad anatomy:1.21), (bad proportions:1.331), extra limbs, (disfigured:1.33), (more than 2 nipples:1.331), (missing arms:1.33), (extra legs:1.331), (fused fingers:1.61051),(too many fingers:1.61051), (unclear eyes:1.331), lowers, bad hands, missing fingers, extra digit, (futa:1.1), bad hands, missing fingers, (((extra arms and legs))), extra hair, (2girls:2.0), signature, watermark, username, atist name

These prompt words tell the AI about the picture style, the character's appearance, clothing, the scene, and additional decorative elements. Don't be put off by how many words there are: many of the prompt words that control style and image quality are boilerplate that stays the same from prompt to prompt.

More prompt words are not always better, but a fuller prompt usually gives better results than a sparse one, along with more precise control over specific needs. So how should we write prompt words to get the pictures we want? Relax: writing prompt words is a very free process, and the AI will attempt to draw whatever you write. In the Stable Diffusion WebUI, prompt words go into the two text boxes at the upper left. As mentioned before, the upper box holds the positive prompt and the lower box holds the negative prompt. Free as the process is, prompt words do have some basic grammar rules that you should master.

First, prompt words need to be written in English. If your English is good enough, write the description in English directly; if not, use translation software.

Second, prompt words are organized as phrases. They do not need the complete grammatical structure, such as main and subordinate clauses, of real English sentences. For example, instead of telling the AI "draw long, wide noodles in a big, round bowl", you can break the request down into (noodles, long, wide) and (bowl, big, round). The AI understands this just as well, and sometimes better than the full sentence.

Phrases must be separated by a delimiter. The basic delimiter is the English half-width comma. Since nearly all the symbols involved are English ones, it is best to switch your input method to English when entering prompt words. Prompt words may be wrapped onto new lines, but it is best to end each line with a separator.

1girl, walking, forest, path, sun, sunshine
1girl, walking, forest, path, sun, sunshine,
shining on body,
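The delimiter rule can be sketched in Python. `join_prompt` is a hypothetical helper written for this illustration, not part of Stable Diffusion or the WebUI. It joins phrases with half-width commas, converts full-width commas typed with a Chinese input method, and ends every line with a separator.

```python
def join_prompt(lines):
    """Join phrase groups into a multi-line prompt.

    `lines` is a list of phrase lists; each inner list becomes one line.
    Phrases are separated by ", " and every line ends with a trailing comma,
    so wrapping onto a new line never fuses two phrases together.
    """
    out = []
    for phrases in lines:
        # Convert full-width commas and trim stray spaces and commas.
        cleaned = [p.replace("，", ",").strip(" ,") for p in phrases]
        cleaned = [p for p in cleaned if p]
        if cleaned:
            out.append(", ".join(cleaned) + ",")
    return "\n".join(out)


prompt = join_prompt([
    ["1girl", "walking", "forest", "path", "sun", "sunshine"],
    ["shining on body"],
])
print(prompt)
```

The printed text matches the two-line example above and can be pasted into the positive prompt box as is.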

After entering these things, click "Generate" to generate a picture. The resulting picture looks like this:

[Image: 00006-1663017696]

It may fit your needs, or it may look weird. AI painting involves a degree of randomness: each time you click "Generate", the result is different. Some people compare AI painting to "drawing cards", because getting a good picture takes luck. "A girl walking in the forest" is a very general description. What does the girl look like? What is in the forest? Is it morning or evening, and what is the weather like? The AI does not know any of this, and if your prompt words are too general, all you can do is keep drawing cards. But don't worry: prompt words are rarely written in one go. You start with a rough draft, then gradually refine, supplement, and fine-tune it.

1girl, walking, forest, path, sun, sunshine, shining on body, looking at viewer, close-up, upper body,
(masterpiece:1.2), best quality, masterpiece, highres, original, extremely detailed wallpaper, perfect lighting, (extremely detailed CG:1.2), drawing, paintbrush

You can also add prompt words that control specific aspects of the picture.

[Image: 00007-2658747475]

2. Classification and writing methods of prompt words

What should you add? Prompt words fall into many categories, which are summarized below.

(1) Character and theme characteristics

  • Clothing: white dress
  • Hair color: blonde hair, long hair
  • Facial features: small eyes, big mouth
  • Facial expression: smiling
  • Body movements: stretching arms

For example: girl, white dress, blonde, long hair, smile, stretch arms, raise hands, beautiful, happy

1girl, walking, forest, path, sun, sunshine, shining on body, 
white dress, blonde hair, long hair, smiling, stretching arms, hands up, beautiful, happy,

[Image: 00000-1611124299]

(2) Scene characteristics

  • Indoor, outdoor: indoor/outdoor
  • Big scene: forest, city, street
  • Small details: tree, bush, white flower

For example: trees, shrubs, white flowers, (forest) paths

1girl, walking, forest, path, sun, sunshine, shining on body, white dress, blonde hair, long hair, smiling, stretching arms, hands up, beautiful, happy,
trees, bush, white flower, outdoor,

(3) Ambient lighting

  • Day and night: day / night
  • Specific time periods: morning, sunset
  • Light environment: sunlight, bright, dark
  • Sky: blue sky, starry sky

For example: daytime, sunshine, blue sky, cloudy sky

1girl, walking, forest, path, sun, sunshine, shining on body, white dress, blonde hair, long hair, smiling, stretching arms, hands up, beautiful, happy,
trees, bush, white flower, path, outdoor,
day, sunlight, blue sky, cloudy sky,

(4) Supplement: Frame perspective

  • Distance: close-up, distant
  • Character proportions: full body, upper body
  • Observation perspective: from above, view of back
  • Lens type: wide angle, Sony A7

For example, close-up:

1girl, walking, forest, path, sun, sunshine, shining on body, white dress, blonde hair, long hair, smiling, stretching arms, hands up, beautiful, happy,
trees, bush, white flower, path, outdoor,
day, sunlight, blue sky, cloudy sky, close-up

These four categories can be called "content-type prompt words". With content-type prompt words alone, however, there is a high probability that the result will not satisfy you. This is where other prompt words come in to give the picture a shot in the arm.

The first is picture quality. Among the pictures the AI learned from, some are high-definition and some are blurry. Quality prompt words steer it toward the high-definition ones.

(5) Image quality prompt words

  • Universal high-definition: best quality, ultra-detailed, masterpiece, hires, 8k
  • Specific high-resolution types: extremely detailed CG unity 8k wallpaper (ultra-fine 8K Unity game CG), unreal engine rendered (unreal engine rendering)

For example: highest quality, super detailed, masterpiece, high resolution, 8K (resolution), ultra detailed Unity CG wallpapers

1girl, walking, forest, path, sun, sunshine, shining on body, white dress, blonde hair, long hair, smiling, stretching arms, hands up, beautiful, happy,
trees, bush, white flower, path, outdoor,
day, sunlight, blue sky, cloudy sky, close-up,
best quality, ultra-detailed, masterpiece, hires, 8k,extremely detailed CG unity 8k wallpaper, unreal engine rendered

(6) Painting style prompt words

  • Illustration style: illustration, painting, paintbrush
  • 2D: anime, comic, game CG
  • Realistic style: photorealistic, realistic, photograph

For example: painting, illustration, animation, game CG

1girl, walking, forest, path, sun, sunshine, shining on body, white dress, blonde hair, long hair, smiling, stretching arms, hands up, beautiful, happy,
trees, bush, white flower, path, outdoor,
day, sunlight, blue sky, cloudy sky, close-up,
best quality, ultra-detailed, masterpiece, hires, 8k, extremely detailed CG unity 8k wallpaper, unreal engine rendered,
painting, illustration, anime, game CG

(5) and (6) can be called standardized prompt words.

(7) Prompt word template

Describe the character:
(1girl:2.0), solo, nilou \(genshin impact\), solo, long hair, jewelry, blue gemstone, earrings,horns, crown, cyan satin strapless dress, white veil, neck ring, red hair, {green eyes},

Describe the scene:
indoor, room, house, sofa, wooden floor, plant, flowers, trees, windows,

Describe the environment (time, lighting):
day, morning, sunlight, dappled sunlight, backlight, light rays, cloudy sky

Describe the framing and perspective:
full body, wide angle shot, depth of field

Other picture elements:
light particles, fantasy, wind blow, maple leaf, dusty,... (add others here later)

High-quality standardization:
{{masterpiece}}, {best quality}, {highres}, original, reflection, unreal engine, body shadow, artstation, extremely detailed CG unity 8k wallpaper

Painting-style standardization:
(illustration), (painting), (sketch), anime coloring, fantasy,

Other special requirements:
exaggerated body proportions, greasy skin, realistic and delicate facial features, SFW
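The template can be treated as a fill-in form. The sketch below is only an illustration: the category keys and both function names are made up for this example. It assembles one prompt line per filled category in the template's order and reports which categories are still empty, so you can see what remains to be described.

```python
# Category order follows the template above; the key names are illustrative.
CATEGORIES = [
    "character", "scene", "environment", "framing",
    "other", "quality", "style", "special",
]


def build_prompt(parts):
    """Return a prompt with one comma-terminated line per filled category."""
    lines = []
    for name in CATEGORIES:
        words = [w.strip() for w in parts.get(name, []) if w.strip()]
        if words:
            lines.append(", ".join(words) + ",")
    return "\n".join(lines)


def missing_categories(parts):
    """List the template categories that have no prompt words yet."""
    return [name for name in CATEGORIES if not parts.get(name)]


parts = {
    "character": ["1girl", "white dress", "blonde hair", "long hair", "smiling"],
    "scene": ["forest", "path", "trees", "bush", "white flower", "outdoor"],
    "environment": ["day", "sunlight", "blue sky"],
    "framing": ["close-up"],
    "quality": ["best quality", "masterpiece", "highres"],
    "style": ["illustration", "painting"],
}
print(build_prompt(parts))
print("still empty:", missing_categories(parts))
```

Keeping content-type and standardized prompt words in separate entries makes it easy to reuse the quality and style lines across pictures while changing only the content.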

3. Weight and negative prompt words

(1) Weight of prompt words

Consider the following prompt:

SFW, 1girl, walking, forest, path, sun, sunshine, shining on body, yellow skirt and white t-shirt, blonde hair, long hair, smiling, stretching arms, hands up, beautiful, happy, trees, bush, white flower, path, outdoor,
(masterpiece:1.2), best quality, masterpiece, highres, original, extremely detailed wallpaper, perfect lighting, (extremely detailed CG:1.2), drawing, paintbrush

You will see a lot of parentheses, decimal points, commas, and other symbols and numbers. What do they do? They raise or lower the priority and weight of particular prompt words. For example, in the picture we generated earlier, we entered "white flower" but could not clearly see any white flowers. When you give the AI many different elements to draw, it may not pick up on the one you care about most, so it may give priority to the trees and the forest. If you really want the white flowers, you can raise their weight and priority. There are two ways to do so:

  • Parentheses

    Wrapping a prompt word in () multiplies its weight by 1.1, making it more prominent relative to other elements. Parentheses can be nested, and each level multiplies by another 1.1, so three levels give 1.1 × 1.1 × 1.1 = 1.331 times. With that, the flowers appear:

    SFW, 1girl, walking, forest, path, sun, sunshine, shining on body, yellow skirt and white t-shirt, blonde hair, long hair, smiling, stretching arms, hands up, beautiful, happy, trees, bush, (((white flower))), path, outdoor,
    (masterpiece:1.2), best quality, masterpiece, highres, original, extremely detailed wallpaper, perfect lighting, (extremely detailed CG:1.2), drawing, paintbrush
    
  • Parentheses plus a numerical weight

    Wrap the prompt word in one pair of parentheses, then add an English colon after the word, followed by a number that sets the weight directly.

    SFW, 1girl, walking, forest, path, sun, sunshine, shining on body, yellow skirt and white t-shirt, blonde hair, long hair, smiling, stretching arms, hands up, beautiful, happy, trees, bush, (white flower:1.5), path, outdoor,
    (masterpiece:1.2), best quality, masterpiece, highres, original, extremely detailed wallpaper, perfect lighting, (extremely detailed CG:1.2), drawing, paintbrush
    

    So when something you asked for is missing from the picture, you can use these methods to emphasize it. Giving a number is more precise, while adding parentheses is more convenient for quick fine-tuning.

    Besides parentheses, there are also curly brackets {}, which multiply the weight by 1.05 for a subtler adjustment. This notation comes from NovelAI, so check whether your own front end supports it.

  • Weakening a prompt word

    To weaken the influence of a prompt word, give it a weight below 1, or wrap it in square brackets [], which reduces its weight to about 0.9 times the original. When adjusting weights, avoid making any single entry too heavy. In my experience the safe range is 1 plus or minus 0.5, that is, from 0.5 to 1.5. A weight of around 2 or higher easily distorts the picture. In that case it is usually better to change approach and reinforce the effect with several related prompt words instead.
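The bracket arithmetic can be checked with a short Python sketch. `effective_weight` is a hypothetical helper for a single prompt term; it is not the WebUI's real parser, and the multipliers (1.1 for (), 1.05 for {}, 0.9 for []) are simply the figures given in this section. A real front end may round or interpret them differently.

```python
import re

# Multiplier per bracket pair, using the figures from the text above.
MULTIPLIER = {"(": 1.1, "{": 1.05, "[": 0.9}
CLOSER = {"(": ")", "{": "}", "[": "]"}

# Explicit form "(word:1.5)": one pair of parentheses, a colon, a number.
EXPLICIT = re.compile(r"\(([^()\[\]{}:]+):\s*([0-9]*\.?[0-9]+)\s*\)")


def effective_weight(term):
    """Return (word, weight) for one term such as '(((white flower)))'."""
    term = term.strip()
    weight = 1.0
    while True:
        match = EXPLICIT.fullmatch(term)
        if match:
            # The number replaces the default 1.1 for this pair.
            word = match.group(1).strip()
            return word, round(weight * float(match.group(2)), 5)
        if len(term) >= 2 and term[0] in MULTIPLIER and term[-1] == CLOSER[term[0]]:
            # Peel one matching bracket pair and apply its multiplier.
            weight *= MULTIPLIER[term[0]]
            term = term[1:-1].strip()
        else:
            return term, round(weight, 5)


for example in ["(((white flower)))", "(white flower: 1.5)", "{green eyes}", "[sun]"]:
    print(example, "->", effective_weight(example))
```

This also explains the odd-looking numbers in widely shared negative prompts: 1.21, 1.331, and 1.61051 are 1.1 raised to the 2nd, 3rd, and 5th power, which is what two, three, or five nested pairs of parentheses amount to.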

In summary:

[Image: image-20230801063233335]

  • Advanced grammar

[Image: image-20230801063519417]

(2) Negative prompt words

The other component of a prompt is the negative prompt. In plain terms: if you want something to appear in the picture, put it in the positive prompt; if you do not want it to appear, put it in the negative prompt. Negative prompt words are not mandatory, but we generally add some common entries, mainly for standardization, such as:

NSFW, (worst quality:2) , (low quality:2), (normal quality:2) , lowres, normal quality, ((monochrome)), ((grayscale)), skin spots, acnes, skin blemishes, age spot, (ugly:1.331), (duplicate:1.331), (morbid:1.21), (mutilated:1.21), (tranny:1.331), mutated hands, (poorly drawn hands:1.5), blurry, (bad anatomy:1.21), (bad proportions:1.331), extra limbs, (disfigured:1.331), (missing arms:1.331), (extra legs:1.331), (fused fingers:1.61051), (too many fingers:1.61051), (unclear eyes:1.331), lowers, bad hands, missing fingers, (((extra arms and legs))),

(3) Detailed explanation of drawing parameters

If prompt words are the spell, then the drawing parameters are the magician's wand and spellbook: they determine the specific form in which the spell is cast.

  • Sampling steps

    An AI-generated image goes through a process of adding noise and then removing it. Each denoising step brings the pixels closer to the final image, so the picture gets clearer with every step. In theory, more sampling steps give a clearer result, but in practice the improvement beyond 20 steps is small, and more steps always mean longer computation. The default is therefore generally 20. If you have computing power to spare and want a finer result, set it to 30-40; do not go below 10.

  • Sampling method

    The sampling method is, simply put, the specific algorithm the AI uses to generate the image. The WebUI offers more than a dozen sampling methods, but only four or five are commonly used. Some of them:

    The two Euler samplers (Euler and Euler a) suit illustration styles and give relatively simple images.

    DPM2 and DPM2 Karras are faster.

    In practice, I recommend the ones marked with "++". These are improved algorithms and are more stable than the ones above.

    In addition, most models recommend a specific sampler, usually one the model's author has tested. For example, the author of Abyss Orange most recommends SDE Karras.

  • Width and height

    These set the resolution of the final picture, and the setting comes with some implicit limits. The default is 512 x 512, but at this resolution a picture looks blurry no matter how detailed it is. If your hardware allows, I would raise it to about 1000 pixels per side.

    The same prompt words run at a higher resolution give a completely different texture. Setting the resolution too high causes problems, though. First, your graphics card may run out of video memory. Second, the picture tends to come out with multiple people or extra hands and feet. I looked into this specifically: the pictures used to train the AI are generally fairly small, so when the resolution is set too large, the AI treats the canvas as several pictures spliced together, and it is no surprise that extra people appear.

    To avoid this, we generally render at low resolution first and then enlarge with high-resolution repair (Hires Fix), which is essentially an additional drawing pass. You can also experiment to find the resolution that balances quality and efficiency on your hardware.

  • Face restoration

    Face restoration should generally be checked. It uses adversarial algorithms to detect faces and repair them, similar to the smart face-retouching function in the Meitu app.

  • Tiling

    Tiling generates texture images that can seamlessly cover the whole screen when repeated. If you don't need that, leave it unchecked; otherwise it will make your picture look very strange.

  • Prompt word relevance (CFG Scale)

    Prompt word relevance is easy to understand: the higher the value, the more faithfully the AI follows your prompt words. As with weights, though, we generally do not push it far. 7-12 is a relatively safe range; values that are too high easily distort the picture.

  • Random seed

    The random seed is another important parameter; it can be used to keep the content of the picture consistent between generations.

  • Batch count

    Because AI painting is unpredictable, even with the same prompt words you have to try again and again, hoping that one attempt gives you a picture that fits your needs perfectly. This can take a long time, sometimes dozens or even hundreds of attempts. If you want the AI to keep drawing with the same prompt words and parameters, increase the batch count and the drawing process repeats automatically.

    When it finishes, you get two things: the pictures from each batch, plus a grid preview that puts them side by side for comparison. So you can have it run 10, 20, or even hundreds of times in one go.

  • Batch size

    Adjusting this is not recommended. Increasing it raises the number of pictures drawn in each batch. In theory this is more efficient, but the pictures in one batch are drawn together as if they were one larger picture, so on weaker hardware it very easily exhausts the video memory.
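To see all of these parameters side by side, here is a minimal Python sketch that collects them into one dictionary and checks them against the rules of thumb from this section. The field names follow the txt2img request of the AUTOMATIC1111 WebUI API as I understand it; treat them as an assumption and verify them against your own installation. `make_payload` and `check_payload` are hypothetical helpers, and nothing is sent over the network.

```python
import json


def make_payload(prompt, negative_prompt="", **overrides):
    """Bundle prompt text and drawing parameters into one dict."""
    payload = {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": 20,                # sampling steps
        "sampler_name": "Euler a",  # sampling method
        "width": 512,
        "height": 512,
        "restore_faces": False,     # face restoration
        "tiling": False,
        "cfg_scale": 7,             # prompt word relevance
        "seed": -1,                 # -1 requests a random seed
        "n_iter": 1,                # batch count
        "batch_size": 1,            # pictures per batch
        "enable_hr": False,         # Hires Fix
    }
    unknown = set(overrides) - set(payload)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    payload.update(overrides)
    return payload


def check_payload(p):
    """Return warnings based on the rules of thumb in this section."""
    warnings = []
    if p["steps"] < 10:
        warnings.append("steps below the suggested minimum of 10")
    if p["steps"] > 40:
        warnings.append("steps above 40: slower, with little visible gain")
    if not 7 <= p["cfg_scale"] <= 12:
        warnings.append("cfg_scale outside the safer 7-12 range")
    if max(p["width"], p["height"]) > 1024 and not p["enable_hr"]:
        warnings.append("large resolution without Hires Fix: extra people or limbs are likely")
    if p["tiling"]:
        warnings.append("tiling is on: only useful for seamless textures")
    if p["batch_size"] > 1:
        warnings.append("batch_size above 1 needs more video memory")
    return warnings


payload = make_payload(
    "1girl, walking, forest, path,",
    "lowres, blurry,",
    steps=30, width=1536, height=1536, cfg_scale=15,
)
print(json.dumps(payload, indent=2))
for message in check_payload(payload):
    print("warning:", message)
```

With these example settings the check reports two warnings, one for the CFG scale and one for the large resolution without Hires Fix. The thresholds are the author's suggestions encoded literally, not hard limits of the software.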

4. How to write the prompt word?

For novices, let me summarize three ways to quickly write prompt words:

(1) The translation method

However complicated prompt words look, they are still human language. When you do not know how to phrase something, describe what you want to draw in natural language, item by item. Again, SD does not understand Chinese, so run the description through a translation platform to get English first. The translated wording is not always exact, but it at least gets you closer to the picture you want. Some plug-ins help correct imprecise wording into terms the AI recognizes, which is quite practical. Developers have made many plug-ins for prompt words that you can explore.

(2) Using tools

AI drawing has been popular for a while, and you and I are certainly not the only ones who find prompt words hard to write. So people have built tools to help you write better prompt words.

Here, I recommend two websites that can be used to assist in writing prompt words:

http://atoolbox.net

https://ai.dawnmark.cn

They are simple to use: tick the entries you need, just like selecting parameters, and the site assembles them according to the grammar rules described above. Then copy and paste the result into your own SD. Using these tools is like a more convenient translation process. Be careful, though, not to let a tool's existing vocabulary limit your ideas; if you want to add something else, try writing it yourself.

(3) Copying homework

In AI painting, copying homework is nothing to be ashamed of, and many creators actively share the spells and models behind their pictures on sites such as:

https://civitai.com/

https://openart.ai

https://arthub.ai

Origin blog.csdn.net/xianyu120/article/details/133266916