Daan

Issue with Dependencies Not Being Found in Serverless Endpoint

I am encountering an issue with a network volume I created. First, I created a network volume and used it to set up a pod. During this setup, I modified the network volume: in the directory where the volume was mounted, I created and activated a virtual environment (venv), and then installed various dependencies into it. Next, I created a serverless endpoint that uses this network volume. As far as I understand, the volume is mounted at the directory runpod-volume. I activate the venv located in this directory and then start a program that is also stored there. However, I quickly run into a problem: the dependencies I installed are not being found. Could you please help me identify where I might be going wrong in this process? It seems like the dependencies installed in the venv are not being recognized or accessed by the serverless endpoint. Thanks
Solution:
I think it is HIGHLY better to just bake the dependencies into the Dockerfile and activate it that way. Also, without seeing your Dockerfile it's hard to say...
Jump to solution
14 Replies
ashleyk
ashleyk6mo ago
venvs are stupid and hard-code their paths, so if you create one under /workspace in GPU cloud, that path will be hard-coded into the venv itself. I work around this by using symbolic links in serverless, e.g.
echo "Symlinking files from Network Volume"
rm -rf /workspace && \
ln -s /runpod-volume /workspace
source /workspace/venv/bin/activate
echo "Symlinking files from Network Volume"
rm -rf /workspace && \
ln -s /runpod-volume /workspace
source /workspace/venv/bin/activate
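The reason the symlink is needed: the venv's activate script (and the shebang line of every script in venv/bin) contains the absolute path from creation time. You can see the hard-coded path yourself - assuming the venv was created at /workspace/venv on the pod:
grep 'VIRTUAL_ENV=' /workspace/venv/bin/activate   # prints the hard-coded VIRTUAL_ENV="/workspace/venv"
head -n1 /workspace/venv/bin/pip                   # shebang prints #!/workspace/venv/bin/python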
Daan
Daan6mo ago
isn't renaming workspace to runpod-volume the better/easier option here?
ashleyk
ashleyk6mo ago
You can try that too
Daan
Daan6mo ago
Is there something special about a directory named "workspace" in RunPod? Because now that I've changed the name of the workspace directory to runpod-volume, I have this strange problem (see message.txt) where I download all the dependencies and then execute a file that needs them. In the logs I see this error (but I do not understand why):
2024-01-01T16:53:06.410909616Z Traceback (most recent call last):
2024-01-01T16:53:06.410938427Z File "/runpod-volume/./app.sh", line 5, in <module>
2024-01-01T16:53:06.413309706Z import torch
2024-01-01T16:53:06.413320126Z ModuleNotFoundError: No module named 'torch'
I guess these dependencies also aren't saved in the network volume then... (Yes, I changed the mount path in my template to "/runpod-volume")
Daan
Daan6mo ago
(this may be more a question for GPU cloud, but I will keep it here if that's okay)
nathaniel
nathaniel6mo ago
that latter pip install output says you're running out of disk in whatever place you're installing the dependencies
...
Installing collected packages: sentencepiece, mpmath, cymem, bitsandbytes, wasabi, urllib3, typing_extensions, tqdm, sympy, spacy-loggers, spacy-legacy, sniffio, smart-open, safetensors, regex, PyYAML, psutil, protobuf, packaging, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, murmurhash, msgpack, MarkupSafe, langcodes, idna, h11, greenlet, fsspec, filelock, distro, cloudpathlib, click, charset-normalizer, certifi, catalogue, annotated-types, typer, triton, srsly, scipy, requests, pynvim, pydantic_core, preshed, nvidia-cusparse-cu12, nvidia-cudnn-cu12, Jinja2, httpcore, blis, anyio, pydantic, nvidia-cusolver-cu12, huggingface-hub, httpx, torch, tokenizers, openai, confection, weasel, transformers, thinc, accelerate, spacy, en-core-web-sm
ERROR: Could not install packages due to an OSError: [Errno 28] No space left on device

...
torch and its dependencies are in that list, which is why it can't import torch later. Maybe the size of the network volume is smaller than the size of the local disk for your container?
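If it is the container disk filling up, one workaround (a sketch, assuming your volume is mounted at /runpod-volume) is to point pip's temp directory at the network volume, since pip downloads and unpacks those big CUDA wheels under TMPDIR before installing:
mkdir -p /runpod-volume/tmp
export TMPDIR=/runpod-volume/tmp    # pip stages downloads and unpacked wheels here
source /runpod-volume/venv/bin/activate
pip install --no-cache-dir torch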
Daan
Daan6mo ago
the size of the network volume is 250GB, so that shouldn't be the problem here. When installing, I also only see the disk utilization of the container increasing, so maybe these dependencies are just being installed on the container disk after all (though I think the commands I used are correct)? Also, I do not fully understand why this happens: if I have installed some dependencies in the venv, they are saved on the network volume, and I can reuse them, because executing files (in the venv) that use those dependencies works fine. But if I try to reinstall these dependencies, I don't get something like "Requirement already satisfied: ...". Also, "pip list" doesn't show these dependencies. This is really strange to me.
ashleyk
ashleyk6mo ago
pip uses the user home directory to cache things by default; you can try adding --no-cache-dir to your pip commands to prevent it from caching things on the container disk. It solved it here: https://discord.com/channels/912829806415085598/1191380773425659924
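For example, something like this (assuming your requirements file also lives on the volume):
source /runpod-volume/venv/bin/activate
pip install --no-cache-dir -r /runpod-volume/requirements.txt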
Daan
Daan6mo ago
yes, I use --no-cache-dir, but I still got the error. I thought it was solved because I didn't get the same error (honestly, I didn't see this one-line error at first). This is the output now (message.txt). But I think everything is installed correctly, since I can execute a file that needs these dependencies. I find it strange, though, that if I execute the same command twice (after a new pod has been started with the same network volume), nothing comes up as "Requirement already satisfied: ...". But running the program does work. However, when I start a serverless endpoint and try to do the same, I get the error that those dependencies are not installed.
ashleyk
ashleyk6mo ago
Make sure your path names match exactly; the venv path is hard-coded, as I mentioned initially.
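A quick way to check which interpreter and pip you are actually using (the exact paths are just what I'd expect from your setup):
which pip                                   # should point into the venv, e.g. /runpod-volume/venv/bin/pip
python -c "import sys; print(sys.prefix)"   # should print the venv path, not /usr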
Solution
justin
justin6mo ago
I think it is HIGHLY better to just bake the dependencies into the Dockerfile and activate it that way. Also, without seeing your Dockerfile it's hard to say.
justin
justin6mo ago
Using a venv through your RunPod volume is kind of a waste of time - more frustration than it's worth, and you still get increased latency pulling packages across two different drives
justin
justin6mo ago
GitHub: runpodWhisperx/Dockerfile at master · justinwlin/runpodWhisperx
justin
justin6mo ago
You can refer to my WhisperX Dockerfile, where I do create a venv and activate it in the image to stabilize all my dependencies. Especially since this is serverless, and assuming you are storing any really heavy models on the network storage, I don't see any reason why a venv for dependencies baked into your Dockerfile itself would be bad. You'd still have a super small image, fast to deploy and fast to spin up.
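The general shape is something like this (a minimal sketch, not my actual Dockerfile - the base image and package list are placeholders):
# placeholder base image - swap in whatever CUDA/Python base you need
FROM runpod/base:0.4.0-cuda11.8.0
# create the venv inside the image so its path never changes between workers
RUN python3 -m venv /venv
# putting the venv first on PATH effectively "activates" it for every later layer
ENV PATH="/venv/bin:$PATH"
# bake the heavy Python dependencies into the image, not the network volume
RUN pip install --no-cache-dir runpod torch
COPY handler.py /handler.py
CMD ["python", "-u", "/handler.py"]
Heavy model weights can still live on the network volume - it's just the Python packages that belong in the image.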