Delay Time

Hello, I'm wondering if these delay times are normal? If not, what should I do?
36 Replies
digigoblin · 2y ago
Seems pretty low to me. Depends on what your worker does.
MinozarOP · 2y ago
It takes a bunch of image URLs and runs ML inference on them. Maybe one question: does the delay time increase if the "active and supposed to be warm" worker hasn't actually been working for a while?
MinozarOP · 2y ago
It kinda feels unreliable at the moment.
(screenshot attached)
MinozarOP · 2y ago
(screenshot attached)
MinozarOP · 2y ago
@Papa Madiator
Madiator2011 · 2y ago
?
MinozarOP · 2y ago
I'm experiencing extreme delay times; how can I get them back to normal?
Madiator2011 · 2y ago
I'm not sure what you are doing and running.
MinozarOP · 2y ago
At the moment it just takes a bunch of image URLs and calculates their embeddings using a specific ML model.
Madiator2011 · 2y ago
Did you bake the models into the Docker image?
MinozarOP · 2y ago
My Dockerfile looks like this:
FROM nvidia/cuda:12.4.1-cudnn-runtime-ubuntu20.04

RUN apt-get update && apt-get install -y \
python3-pip \
wget \
&& rm -rf /var/lib/apt/lists/*

WORKDIR /app

RUN pip install runpod==1.6.2 torch torchvision torchaudio sentence-transformers==2.7.0

# caching the model
RUN python3 -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('clip-ViT-B-32', device='cpu')"

COPY builder/requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

COPY src/ .

CMD ["python3", "handler.py"]
Unknown User · 2y ago
(message not public)
MinozarOP · 2y ago
This Python line does two things: it downloads the model and saves it somewhere, and it loads it into RAM. If I remove the line and only do that at runtime, I'd have to download it each time, no? (Or maybe I misunderstood something, sorry for that.)
Unknown User · 2y ago
(message not public)
MinozarOP · 2y ago
Usually, what's the proper way to bake models into Docker images? Any examples?
Unknown User · 2y ago
(message not public)
Madiator2011 · 2y ago
I would check the logs when the worker is starting.
Unknown User · 2y ago
(message not public)
MinozarOP · 2y ago
Exactly, so it's not very clear to me how to cache models in the Docker image; I thought my implementation would work:
import runpod
import numpy as np

# Can I load the model here??

def handler(job):
    return True

runpod.serverless.start({"handler": handler})
Is that what you meant?
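
(Filled in, the pattern would look roughly like this: the load moves out of the handler so it runs once per worker at import time, pulling the weights from the image's cache rather than the network. The model name comes from earlier in the thread; the input schema and embedding call are illustrative assumptions:)

import runpod
from sentence_transformers import SentenceTransformer

# Runs once when the worker process starts, not on every job;
# the weights are read from the image's cache, so nothing is downloaded here.
model = SentenceTransformer("clip-ViT-B-32")

def handler(job):
    urls = job["input"]["image_urls"]  # hypothetical input schema
    # Fetch the images, then e.g.: embeddings = model.encode(images)
    return {"received": len(urls)}

runpod.serverless.start({"handler": handler})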
Unknown User · 2y ago
(message not public)
MinozarOP · 2y ago
So everything put outside of the handler function will be cached in the Docker image??
Unknown User · 2y ago
(message not public)
MinozarOP · 2y ago
Yes, but the documentation says "be sure to cache them into your docker image." How do I do that correctly? (The doc doesn't provide enough information, I think.)
Unknown User · 2y ago
(message not public)
MinozarOP · 2y ago
If I have no active worker, then each time I spawn a new one it'll download the model. That's not what I want; I'd like the model to be cached in the Docker image so that when I spawn a new worker the model is already almost ready to use.
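
(One way to verify the bake actually worked, sketched here as an assumption rather than something from the thread: force Hugging Face offline mode at runtime, so an accidental re-download fails loudly at startup instead of silently inflating the delay. HF_HUB_OFFLINE is a real huggingface_hub environment variable and must be set before the import:)

import os
os.environ["HF_HUB_OFFLINE"] = "1"  # error out rather than download

from sentence_transformers import SentenceTransformer

# If the weights aren't already baked into the image, this raises at
# worker startup, which is far easier to spot than a slow first request.
model = SentenceTransformer("clip-ViT-B-32")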
Unknown User · 2y ago
(message not public)
Madiator2011 · 2y ago
It's also a good idea to load the model into VRAM in the handler.
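
(A sketch of that suggestion: SentenceTransformer subclasses torch.nn.Module, so .to("cuda") works; after the first job the weights are already resident and the call is close to free, assuming the worker process stays alive between jobs:)

import runpod
from sentence_transformers import SentenceTransformer

# Loaded on CPU at import time, from the cache baked into the image.
model = SentenceTransformer("clip-ViT-B-32", device="cpu")

def handler(job):
    # The first job pays the host-to-VRAM copy; later jobs find the
    # weights already on the GPU, so this is nearly a no-op.
    model.to("cuda")
    return {"device": str(next(model.parameters()).device)}

runpod.serverless.start({"handler": handler})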
Unknown User · 2y ago
(message not public)
Madiator2011 · 2y ago
yup
Madiator2011 · 2y ago
GitHub: worker-sdxl/src/rp_handler.py at main · runpod-workers/worker-sdxl (RunPod worker for Stable Diffusion XL)
MinozarOP · 2y ago
But this only loads the model into VRAM; first I'd need to download it from somewhere. After reading the documentation, I understand it has to come from the image's cache, so it has to be tied to the Dockerfile somehow. OK, update: I just checked the Dockerfile of the project you shared:
# Base image
FROM runpod/base:0.4.2-cuda11.8.0

ENV HF_HUB_ENABLE_HF_TRANSFER=0

# Install Python dependencies (Worker Template)
COPY builder/requirements.txt /requirements.txt
RUN python3.11 -m pip install --upgrade pip && \
python3.11 -m pip install --upgrade -r /requirements.txt --no-cache-dir && \
rm /requirements.txt

# Cache Models
COPY builder/cache_models.py /cache_models.py
RUN python3.11 /cache_models.py && \
rm /cache_models.py

# Add src files (Worker Template)
ADD src .

CMD python3.11 -u /rp_handler.py
I think this is what I was looking for:
# Cache Models
COPY builder/cache_models.py /cache_models.py
RUN python3.11 /cache_models.py && \
rm /cache_models.py
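
(For the CLIP worker in this thread, the equivalent builder/cache_models.py would be roughly the following; the filename just mirrors worker-sdxl's layout, and the body is a sketch, not the repo's actual script:)

# builder/cache_models.py
from sentence_transformers import SentenceTransformer

def cache_model():
    # Instantiating the model triggers the download into the image's
    # cache; the object is discarded, only the files on disk matter.
    SentenceTransformer("clip-ViT-B-32", device="cpu")

if __name__ == "__main__":
    cache_model()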
Unknown User · 2y ago
(message not public)
MinozarOP · 2y ago
Well, I'm even more confused now... that's from the official repo, so what's wrong with this Dockerfile? I don't understand this part; it's the opposite of what's shown in the repo.
Unknown User · 2y ago
(message not public)
mr47 · 2y ago
@Minozar 1. Enable snapshot mode to reduce cold start time. 2. Consider speeding up model loading, or replacing the model with one that loads faster. On a cold start the model is reloaded every time, so it's normal for it to be slow.
flash-singh · 2y ago
Enable FlashBoot if you haven't. If you're using queue delay scaling, set a lower time to scale faster.
