[RUNPOD] Minimize Worker Load Time (Serverless)
Hey fellow developers,
I'm currently facing a challenge with worker load time in my setup. I'm using a network volume for models, which is working well. However, I'm struggling with Dockerfile re-installing Python dependencies, taking around 70 seconds.
API request handling is smooth, clocking in at 15 seconds, but if the worker goes inactive, the 70-second wait for the next request is a bottleneck. Any suggestions on optimizing this process? Can I use a network volume for Python dependencies like I do for models, or are there any creative solutions out there? Sadly, no budget for an active worker.
Thanks for your insights!

Solution
Initializing models over a network volume is inherently slower because you're loading from a different drive. If you can, it's easier to bake the dependencies into the Docker image, as ashelyk said; a sketch of that follows below.
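For reference, here is a minimal Dockerfile sketch that installs the Python dependencies at build time, so a cold-started worker doesn't reinstall them. The base image and the requirements.txt / handler.py names are assumptions about your project, not your actual setup:

# Bake Python dependencies into the image at build time so the worker
# does not reinstall them on every cold start.
FROM python:3.11-slim

WORKDIR /app

# Copy only requirements first so this layer is cached between builds
# and only rebuilt when requirements.txt actually changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the handler code last; code changes no longer invalidate the
# dependency layer above.
COPY handler.py .

CMD ["python", "-u", "handler.py"]

With this layout the pip install happens once per image build, not once per worker start, so the 70-second install drops out of the cold-start path entirely.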
Your other option is to increase the idle timeout so the worker stays alive after handling a request. That way your first request initializes the model into VRAM, and subsequent requests are picked up quickly by the same warm worker; see the handler sketch below.
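A rough sketch of that pattern in a RunPod handler, assuming a PyTorch model stored on the network volume; load_model, the model path, and the inference call are hypothetical placeholders for your own loading code:

# Load the model once at module import, not per request, so only the first
# request after a cold start pays the load cost.
import runpod
import torch

MODEL_PATH = "/runpod-volume/model.pt"  # network volume mount point on serverless; adjust to your setup

def load_model(path):
    # Hypothetical loader; swap in however you actually build and load your model.
    model = torch.load(path, map_location="cuda", weights_only=False)
    return model.eval()

# Runs once per worker process. Raising the endpoint's idle timeout keeps the
# warm worker (and the model in VRAM) around for subsequent requests.
MODEL = load_model(MODEL_PATH)

def handler(job):
    prompt = job["input"].get("prompt", "")
    with torch.no_grad():
        output = MODEL(prompt)  # placeholder inference call
    return {"output": str(output)}

runpod.serverless.start({"handler": handler})

The trade-off is that a longer idle timeout burns some compute while the worker sits warm, but it's cheaper than a permanently active worker and avoids repeating the cold-start cost on every request.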