Setting the baseline
Before turning to the optimization solutions, let’s measure the speed and VRAM usage with the default settings, so that we know how much VRAM an optimization saves or how much it improves the speed.
Let’s use a non-cherry-picked number, 1, as the generator seed so that random seeding does not affect the comparison. The tests are conducted on an RTX 3090 with 24 GB of VRAM running Windows 11, with a second GPU rendering all other windows and the UI so that the RTX 3090 can be dedicated to the Stable Diffusion pipelines:
import torch
from diffusers import StableDiffusionPipeline

text2img_pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
).to("cuda:0")

# generate an image
prompt = "high resolution, a photograph of an astronaut riding a horse"
image = text2img_pipe(
    prompt = prompt,
    ...
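To make the baseline numbers concrete, here is a minimal sketch of one way to record elapsed time and peak VRAM around a single pipeline call. The helper names `measure` and `cuda_ready` are hypothetical and are not part of torch or diffusers. The sketch also assumes that PyTorch's own allocator statistics are an acceptable stand-in for VRAM usage; these count only memory allocated through PyTorch, so they can read lower than the figure shown by Task Manager or nvidia-smi.

```python
import time

try:
    import torch
except ImportError:  # allows timing-only use on machines without PyTorch
    torch = None


def cuda_ready():
    """Return True only when PyTorch is installed and a CUDA device is visible."""
    return torch is not None and torch.cuda.is_available()


def measure(fn, device=None):
    """Run fn() once and return (result, seconds, peak_vram_gb or None).

    Hypothetical helper for illustration. device=None means the current
    CUDA device, which is cuda:0 unless it has been changed.
    """
    use_cuda = cuda_ready()
    if use_cuda:
        torch.cuda.reset_peak_memory_stats(device)  # start peak tracking from zero
        torch.cuda.synchronize(device)              # finish pending GPU work first
    start = time.perf_counter()
    result = fn()
    if use_cuda:
        torch.cuda.synchronize(device)              # wait for the GPU before stopping the clock
    seconds = time.perf_counter() - start
    peak_gb = None
    if use_cuda:
        # Peak memory allocated by PyTorch tensors since the reset, in GiB.
        peak_gb = torch.cuda.max_memory_allocated(device) / 1024**3
    return result, seconds, peak_gb
```

With the pipeline loaded as above, a call could look like `measure(lambda: text2img_pipe(prompt=prompt, generator=torch.Generator(device="cuda:0").manual_seed(1)))`. Running it once before and once after applying an optimization gives two comparable pairs of numbers. The first call after loading a model is usually slower because of warm-up, so it is worth discarding that run.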