Stable Diffusion – Giantess guide v1.
Hardware
To run Stable Diffusion locally you need:
- An Nvidia graphics card with at least 6 GB of VRAM
- 10 GB of free hard drive space
- 8 GB of system memory (RAM)
You can try to run it without fulfilling the requirements, but it might not work. I have 12 GB of
VRAM and plenty of RAM and space on my hard drive.
Software
AUTOMATIC1111: https://github.com/AUTOMATIC1111/stable-diffusion-webui
CyberRealistic v3.3: https://civitai.com/models/15003/cyberrealistic
you9134's Giantess LoRA: https://civitai.com/models/144449/you9134s-giantess-lora
Install AUTOMATIC1111 and then put the CyberRealistic file in the stable-diffusion-webui\models\Stable-diffusion folder and the Giantess LoRA in the stable-diffusion-webui\models\Lora folder.
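Note that the <lora:gb4:0.8> tag used in the prompts below refers to the LoRA's filename without its extension, so the downloaded file should be named gb4.safetensors (rename it if it isn't). If you want to double-check that everything landed in the right place, here is a quick sketch in Python; the install path and exact filenames are assumptions, so adjust them to your setup.

```python
from pathlib import Path

# Assumed install location and filenames; adjust to match your setup.
webui = Path(r"C:\stable-diffusion-webui")
files = [
    webui / "models" / "Stable-diffusion" / "cyberrealistic_v33.safetensors",
    webui / "models" / "Lora" / "gb4.safetensors",
]
for f in files:
    print(f, "OK" if f.is_file() else "MISSING")
```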
Settings
Set the checkpoint to CyberRealistic. I use 20 Sampling steps, DPM++ 2M Karras as the Sampling
method and CFG scale 7.
For width and height I use 512×768 or 640×640 depending on the intended pose of the giantess.
Batch size determines how many images are generated in parallel. I set this to 6. If your GPU doesn't have a lot of VRAM you might get an error if this value is too high; in that case, decrease the value until it works.
Batch count determines how many batches will be generated sequentially. If you prefer, you can set this to a high value and then, once it's done, go through all the generated images and pick the best ones.
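As a side note, everything in this section can also be driven programmatically: AUTOMATIC1111 exposes an HTTP API when it is started with the --api command-line flag. Below is a minimal Python sketch using the settings above; the server address assumes a default local install, the prompt is just a placeholder, and the exact sampler string can vary between web UI versions.

```python
import base64
import requests

URL = "http://127.0.0.1:7860"  # default local address; requires the --api flag

payload = {
    "prompt": "giantess, attractive woman, (city destruction:1.4), metropolis, <lora:gb4:0.8>",
    "negative_prompt": "(worst quality, low quality:1.3), logo, watermark, signature, "
                       "(horns), (wings), monochrome, illustration, painting, cartoon, "
                       "3d, bad art, poorly drawn, blurry, disfigured, deformed, extra limbs",
    "steps": 20,                        # sampling steps
    "sampler_name": "DPM++ 2M Karras",  # sampling method (name may differ by version)
    "cfg_scale": 7,
    "width": 512,
    "height": 768,
    "batch_size": 6,  # images generated in parallel; lower this if you run out of VRAM
    "n_iter": 1,      # batch count: how many batches are generated sequentially
}

r = requests.post(f"{URL}/sdapi/v1/txt2img", json=payload)
r.raise_for_status()

# Images come back base64-encoded; decode and save each as a PNG.
for i, img in enumerate(r.json()["images"]):
    with open(f"giantess_{i:02d}.png", "wb") as f:
        f.write(base64.b64decode(img))
```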
Prompting
The prompts determine what image will be generated. The program will try to make an image that
corresponds to the text in the prompt and is dissimilar to what is in the negative prompt.
You can increase or decrease the weight of parts of the prompt by putting the words in parentheses and adding a number after a colon. For instance, typing (city destruction:1.25) will produce more destruction effects than just typing city destruction. I normally don't use a weight higher than 1.5 since that might introduce strange artifacts. Also, if you have the name of a person in a prompt you should not give that name any weight, unless you want them to look like a caricature of themselves.
For the negative prompt I always add (worst quality, low quality:1.3), logo,
watermark, signature, (horns), (wings), monochrome, illustration,
painting, cartoon, 3d, bad art, poorly drawn, blurry, disfigured,
deformed, extra limbs, but sometimes I add a few more words.
What I put in the positive prompt depends on what I am trying to create, but you should always add <lora:gb4:0.8> to the end of the prompt in order to activate the Giantess LoRA. As with prompt words, 0.8 is the weight. I've found this value to work well, but you can try increasing or decreasing it.
For setting the size of the giantess you can use the words mini, mega, and giga. Although these only have a minimal impact, I would still recommend using them. For a greater effect, use terms such as high altitude photography or aerial view for a bigger giantess, and skyscrapers for a smaller one.
Most of the generated images will have some issues, such as incorrect anatomy, a mixture of building sizes, or destruction effects in places that make no sense. This is to be expected. However, if you generate a lot of images, say around 60, and none of them are good, then your prompt is probably not good. Try modifying it until you get better results. What separates a good prompt from a bad one is not clear to me, so you will just have to experiment until you get something that works.
Trying to generate multiple giantesses means that the anatomy can get messed up for several people, so getting this right is hard and in general not worth it in my opinion.
Example prompts:
Prompt: giantess, attractive woman, (city destruction:1.4), playful, metropolis, sneakers, t-shirt,
jeans shorts, urban sprawl, Mexico, <lora:gb4:0.8>
Negative prompt: (worst quality, low quality:1.3), logo, watermark, signature, (horns), (wings),
monochrome, illustration, painting, cartoon, 3d, bad art, poorly drawn, blurry, disfigured,
deformed, extra limbs, plane wing, night, mini, dawn, dusk, muted colors
Prompt: mega, giantess, attractive woman, blonde, (city destruction:1.4), playful, skyscrapers,
cocktail dress, Miami, <lora:gb4:0.8>
Negative prompt: (worst quality, low quality:1.3), logo, watermark, signature, (horns), (wings),
monochrome, illustration, painting, cartoon, 3d, bad art, poorly drawn, blurry, disfigured,
deformed, extra limbs, plane wing, night, mini, dawn, dusk, muted colors
Prompt: mini, giantess, attractive african american woman, frizzy hair, sitting on building,
skyscrapers, New York, street view, <lora:gb4:0.8>
Negative prompt: (worst quality, low quality:1.3), logo, watermark, signature, (horns), (wings),
monochrome, illustration, painting, cartoon, 3d, bad art, poorly drawn, blurry, disfigured,
deformed, extra limbs, plane wing, night, dawn, dusk, muted colors
Prompt: (giga:1.33), giantess, attractive woman, (city destruction:1.4), playful, sitting, on her
knees, metropolis, urban sprawl, Tokyo, aerial view, <lora:gb4:0.8>
Negative prompt: (worst quality, low quality:1.3), logo, watermark, signature, (horns), (wings),
monochrome, illustration, painting, cartoon, 3d, bad art, poorly drawn, blurry, disfigured,
deformed, extra limbs, plane wing, night, mini, dawn, dusk, muted colors
For more examples, see the generated images I've uploaded. I will add the prompt used to create each one to its description.
Upscaling
For upscaling I use this workflow: https://www.reddit.com/r/StableDiffusion/comments/13v461x/a_workflow_to_upscale_to_4k_resolution_with/, except that I use DPM++ 2M Karras instead of DDIM. Note that this workflow requires you to install some extra stuff; see the link for details. If the guide is unclear I can try to add some extra info here.
My workflow
1. Invent some prompt.
2. Generate images until you get one that you are happy with.
3. Go to the PNG info tab.
4. Drag the image you generated into the 'Source' window or click to upload.
5. Click 'Send to img2img'.
6. Go to the 'img2img' tab.
7. Enter the settings you want (see linked workflow above) and then click 'Generate'.
8. Once this is done, drag the upscaled image (shown on the right) into the img2img area on the left, replacing the image that will be upscaled next.
9. You can now change your settings and click 'Generate' again to upscale again.
10. Repeat upscaling as many times as you want. I do three rounds of upscaling (as in the workflow linked above), but two rounds are probably good enough, and one is the minimum.
When upscaling you might want to tweak your prompt somewhat. Most notably, I've noticed that destruction effects can be added where they make no sense if terms such as city destruction or fire have too much weight, so you might want to reduce these weights, perhaps even to values below 1.
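If you would rather script these upscaling rounds than click through the UI, here is a rough sketch against the same API as above. It does plain img2img passes at increasing resolution rather than the SD upscale script from the linked workflow, so treat it as an approximation; the filenames, the 1.5× scale per round, and the denoising strength are assumptions.

```python
import base64
import requests

URL = "http://127.0.0.1:7860"  # assumes the web UI runs locally with --api

def img2img_pass(image_b64, prompt, negative, width, height, denoise):
    """One img2img pass: re-generate the image at a larger resolution."""
    payload = {
        "init_images": [image_b64],
        "prompt": prompt,
        "negative_prompt": negative,
        "sampler_name": "DPM++ 2M Karras",
        "steps": 20,
        "cfg_scale": 7,
        "width": width,
        "height": height,
        "denoising_strength": denoise,
    }
    r = requests.post(f"{URL}/sdapi/v1/img2img", json=payload)
    r.raise_for_status()
    return r.json()["images"][0]

with open("base.png", "rb") as f:  # the txt2img image you picked
    image = base64.b64encode(f.read()).decode()

# Note the reduced destruction weight, per the tip above.
prompt = "giantess, attractive woman, (city destruction:0.9), metropolis, <lora:gb4:0.8>"
negative = "(worst quality, low quality:1.3), logo, watermark, signature"

w, h = 512, 768
for round_no in range(1, 4):  # three rounds, as in the linked workflow
    w, h = int(w * 1.5), int(h * 1.5)
    image = img2img_pass(image, prompt, negative, w, h, denoise=0.3)
    with open(f"upscaled_round{round_no}.png", "wb") as f:
        f.write(base64.b64decode(image))
```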
Optional manual steps
Even if you manage to generate a really good image, there will probably be some issues you want to correct manually. For instance, there might be an out-of-place building you want to remove, or destruction effects where they make no sense. Feet and hands on the ground also often look weird; you can obscure these by adding some smoke in front of them. You can do your corrections in e.g. Photoshop or GIMP. The corrections don't have to be very precise since details are invented when upscaling. I normally spend 5 to at most 15 minutes doing corrections. You can then save the corrected image as a PNG and drag it into the img2img tab to upscale this image instead of the original (don't forget to add the prompt). Corrections may have to be made after every round of upscaling since new errors might be introduced. Corrections are of course not necessary, but I would recommend them to make your images look better.
You can also use the inpainting feature in AUTOMATIC1111 to improve e.g. faces (common errors are a deformed iris or an odd-looking ear) or other things that look a bit weird. My approach is to upscale until the thing I want to inpaint fits inside a square of 512×512 or 640×640 pixels. I then cut out this part, save it as a PNG, and drag it to the inpaint window under img2img. For the prompt I add the name of the person if it is a face, otherwise whatever the image depicts. For the negative prompt I add the same text as when I generated the base image. I set Mask blur to about 16, Masked content to original, Width and Height to whatever size the image is, the sampling method to DPM++ 2M Karras, and Denoising strength to about 0.6 (modify if the results are not good). Then I fill in the area that needs to be fixed using the paint brush, generate a few images, and take the one that looks best. Finally, I take this image and replace the cut-out part with it.
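The cut-out and paste-back steps are easy to script if you prefer that over an image editor. Here is a minimal sketch with Pillow, where the coordinates and filenames are hypothetical:

```python
from PIL import Image

# Hypothetical top-left corner of the problem area and the crop size.
LEFT, TOP, SIZE = 800, 240, 512

full = Image.open("upscaled.png")

# Cut out the square around the problem area and save it for inpainting.
crop = full.crop((LEFT, TOP, LEFT + SIZE, TOP + SIZE))
crop.save("to_inpaint.png")

# ...inpaint to_inpaint.png in the web UI as described above, save as fixed.png...

# Paste the fixed crop back into the full image at the same position.
fixed = Image.open("fixed.png")
full.paste(fixed, (LEFT, TOP))
full.save("upscaled_fixed.png")
```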