Runpod2y ago
k916

How to use LoRAs in SDXL serverless?

I don't see any docs about adding LoRAs to the workers for SDXL. I am assuming this is the worker that I should be using: https://github.com/runpod-workers/worker-sdxl
GitHub
GitHub - runpod-workers/worker-sdxl: RunPod worker for Stable Diffu...
RunPod worker for Stable Diffusion XL. Contribute to runpod-workers/worker-sdxl development by creating an account on GitHub.
Solution:
Yeah, the runpod SDXL worker doesn't support LoRA
Jump to solution
28 Replies
briefPeach
briefPeach2y ago
Maybe you can use a ComfyUI worker, so you can just use a LoRA ComfyUI workflow
briefPeach
briefPeach2y ago
GitHub
GitHub - blib-la/runpod-worker-comfy: ComfyUI as a serverless API o...
ComfyUI as a serverless API on RunPod. Contribute to blib-la/runpod-worker-comfy development by creating an account on GitHub.
briefPeach
briefPeach2y ago
I use this ^ it’s great, works in serverless. But u need to learn a bit about comfyui
k916
k916OP2y ago
Thanks a lot, you might have saved me a lot of time!
Solution
digigoblin
digigoblin2y ago
Yeah, the runpod SDXL worker doesn't support LoRA
Unknown User
Unknown User2y ago
Message Not Public
Sign In & Join Server To View
k916
k916OP2y ago
Yeah, not sure about the syntax or anything, unfortunately.
@briefPeach Hey Brief, sorry about tagging you. I am wondering about the Dockerfile code. Any idea how to add your own models and LoRAs?
ADD models/checkpoints/sdxl.safetensors models/checkpoints/
ADD models/loras/sdxl.safetensors models/loras/
I swapped out the RUN with these lines, which point to my own model and LoRA. Any idea if that's right?
Yeah! That's unfortunate, trying to work out another way using the ComfyUI worker that Brief recommended.
digigoblin
digigoblin2y ago
There is also this one that I use for my production applications - https://github.com/ashleykleynhans/runpod-worker-comfyui
GitHub
GitHub - ashleykleynhans/runpod-worker-comfyui: RunPod Serverless W...
RunPod Serverless Worker for the ComfyUI Stable Diffusion API - ashleykleynhans/runpod-worker-comfyui
digigoblin
digigoblin2y ago
It uses network storage, so you mount your network storage on a pod and install all of your custom nodes. Then you use a normal pod to create your workflows and send them to the endpoint.
k916
k916OP2y ago
Thanks a lot! I'll look into this right now, looks promising! I am illiterate with ComfyUI; I am assuming custom nodes are the same thing as throwing in your custom models and LoRAs?
digigoblin
digigoblin2y ago
No, you use custom nodes to add nodes to your workflows to achieve different results.
k916
k916OP2y ago
Gotcha, thanks a bunch!
briefPeach
briefPeach2y ago
Hi, to add/download a LoRA, you need to do this in the Dockerfile:
RUN wget -O models/loras/xl_more_art-full_v1.safetensors https://civitai.com/api/download/models/152309
OR, if you want to add your LoRA file from your own computer without downloading it via wget, you need to:
ADD relative/path/to/sdxl.safetensors /comfyui/models/loras/sdxl.safetensors
Make sure the file relative/path/to/sdxl.safetensors exists inside the runpod-worker-comfyui folder you pulled. I think the preferred way is the first method (RUN wget).
@digigoblin I was also checking out this network volume method. I wonder how the launch and inference speed feels? I assume it's a bit slower than loading everything from container disk without a network volume, since the network volume isn't physically attached to your GPU machine.
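Putting the two options side by side, here is a minimal Dockerfile sketch. It assumes the /comfyui/models layout used by the ComfyUI workers; the FROM tag is a placeholder, not the worker's actual base image, and the file names and Civitai URL are just the examples from this thread.

```dockerfile
# Sketch only: replace the base image with whatever the worker's Dockerfile starts from.
FROM comfyui-worker-base:latest

WORKDIR /comfyui

# Option 1: download the LoRA at build time (wget must exist in the base image)
RUN wget -O models/loras/xl_more_art-full_v1.safetensors \
    https://civitai.com/api/download/models/152309

# Option 2: copy local files from the build context
# (the source paths must exist next to the Dockerfile)
ADD models/checkpoints/sdxl.safetensors /comfyui/models/checkpoints/
ADD models/loras/sdxl.safetensors       /comfyui/models/loras/
```

For local files, COPY and ADD behave the same; ADD additionally handles URLs and archive extraction, so COPY is usually the more explicit choice.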
digigoblin
digigoblin2y ago
wget is not the preferred way, it's actually better to use COPY/ADD so that you don't need to download the model every single time you build your Docker image
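To illustrate that point, a rough sketch of the layer ordering (the base image name and paths here are placeholders, not from any specific worker): COPY reads from the local build context, so rebuilds never hit the network for the model, and keeping the large model copy as one of the last layers lets Docker's cache reuse the earlier dependency layers while you iterate.

```dockerfile
# Sketch only: base image and paths are placeholders.
FROM comfyui-worker-base:latest

WORKDIR /comfyui

# Dependency / custom-node setup first, so these layers stay cached across rebuilds
COPY requirements.txt /comfyui/requirements.txt
RUN pip install --no-cache-dir -r /comfyui/requirements.txt

# Large model files last: changing a model only rebuilds this layer,
# and COPY pulls from the local build context instead of downloading.
COPY models/loras/sdxl.safetensors /comfyui/models/loras/
```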
briefPeach
briefPeach2y ago
Also, have you used network volumes at scale? For example, if I have 5 serverless workers pointing to the same network volume, will the file I/O speed be OK?
Unknown User
Unknown User2y ago
Message Not Public
Sign In & Join Server To View
digigoblin
digigoblin2y ago
Yes, network volume disk is slower
briefPeach
briefPeach2y ago
Interesting, how much slower does it feel? (I know it's hard to quantify, but I want to get a rough feeling, like 10%, 50%?)
Right now I'm using container disk, and it's 2-5 seconds of inference for the default ComfyUI text-to-image workflow (cheapest 16 GB GPU, an A4000, inference only, assuming the server is already booted and the model is already downloaded). I wonder how much inference time I should expect if using a network volume.
Unknown User
Unknown User2y ago
Message Not Public
Sign In & Join Server To View
briefPeach
briefPeach2y ago
Got it! Then it's not too bad, acceptable I think. Thank you! It's very useful information.
Oh yeah, this is a valid point. If you frequently rebuild your image, you should use ADD/COPY.
digigoblin
digigoblin2y ago
I often build it multiple times while testing
Unknown User
Unknown User2y ago
Message Not Public
Sign In & Join Server To View
briefPeach
briefPeach2y ago
Yeah, I'm about to try that, but I worry that multiple pods reading/writing to the same network volume in parallel would make the I/O even slower 😂
Unknown User
Unknown User2y ago
Message Not Public
Sign In & Join Server To View
briefPeach
briefPeach2y ago
thank you i'll give it a try!
Madiator2011
Madiator20112y ago
The SDXL worker is based on the diffusers format of SDXL
briefPeach
briefPeach2y ago
OK, I finally tried it and have some benchmarks using a network volume in serverless, with the default ComfyUI text-to-image workflow, SD 1.5, and the model already downloaded and saved in the network volume's comfyui/models folder:
The first run is the slowest: around 15 sec of pure inference + 2 sec uploading to the S3 bucket (17 sec in total, as you can see in the screenshot).
The subsequent runs are quicker: 2-3 seconds in total (without uploading to S3, it's around 1-2 sec).
The reason the 1st run is slow is that it has to load the model into VRAM, I guess? I'm using the cheapest 16 GB GPU. So I think the speed looks OK!
[Attachment: benchmark screenshot]
Unknown User
Unknown User2y ago
Message Not Public
Sign In & Join Server To View
