$0 balance in my account
vllm + Ray issue: Stuck on "Started a local Ray instance."
TheBloke/goliath-120b-AWQ on vllm + runpod with 2x48GB GPUs:
```
2024-02-03T12:36:44.148649796Z The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
2024-02-03T12:36:44.149745508Z
0it [00:00, ?it/s]...
```
Similar speed of workers on different GPUs
Docker daemon is not started by default?
VLLM Worker Error that doesn't time out.
2024-02-01T18:08:19.928745487Z {"requestId": null, "message": "Traceback: Traceback (most recent call last):\n File \"/usr/local/lib/python3.11/dist-packages/runpod/serverless/modules/rp_job.py\", line 55, in get_job\n async with session.get(_job_get_url()) as response:\n File \"/usr/local/lib/python3.11/dist-packages/aiohttp/client.py\", line 1187, in __aenter__\n self._resp = await self._coro\n ^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/dist-packages/aiohttp/client.py\", line 601, in _request\n await resp.start(conn)\n File \"/usr/local/lib/python3.11/dist-packages/aiohttp/client_reqrep.py\", line 965, in start\n message, payload = await protocol.read() # type: ignore[union-attr]\n ^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/dist-packages/aiohttp/streams.py\", line 622, in read\n await self._waiter\naiohttp.client_exceptions.ClientOSError: [Errno 104] Connection reset by peer\n", "level": "ERROR"}
2024-02-01T18:08:19.929440753Z {"requestId": null, "message": "Failed to get job. | Error Type: ClientOSError | Error Message: [Errno 104] Connection reset by peer", "level": "ERROR"}
refresh_worker does that, but I don't think it works for the RunPod internal stuff; it's more for when your handler raises an exception. @Justin Merrell will have to confirm. I assume this is the latest version of the SDK?
Quick Python vLLM endpoint example, please?
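For anyone looking for a starting point, here is a minimal sketch of a RunPod serverless handler that wraps vLLM. The model name, quantization setting, and sampling parameters are illustrative placeholders, not a recommended configuration.
```
# Minimal sketch of a RunPod serverless endpoint wrapping vLLM.
# Model name and sampling parameters below are placeholders.
import runpod
from vllm import LLM, SamplingParams

llm = LLM(model="TheBloke/goliath-120b-AWQ", quantization="awq")  # loaded once per worker

def handler(job):
    prompt = job["input"]["prompt"]
    params = SamplingParams(max_tokens=256, temperature=0.7)
    outputs = llm.generate([prompt], params)
    return {"text": outputs[0].outputs[0].text}

runpod.serverless.start({"handler": handler})
```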
Best way to deploy a new LLM serverlessly without having to build large Docker images?
Pause on the yield in async handler
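For context, a streaming handler in the RunPod Python SDK is an (async) generator that yields partial results, with each yield sent back as a stream update. A minimal sketch, with the sleep and chunk values made up for illustration:
```
# Sketch of an async generator handler that streams partial results.
# The asyncio.sleep and the chunk values are illustrative stand-ins for real work.
import asyncio
import runpod

async def handler(job):
    for chunk in ["partial ", "results ", "streamed ", "back"]:
        await asyncio.sleep(0.1)  # stand-in for real async work
        yield chunk  # each yield becomes one stream update

runpod.serverless.start({
    "handler": handler,
    "return_aggregate_stream": True,  # assumption: also expose the concatenated output via /run
})
```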
worker-vllm cannot download private model
How do I select a custom template without creating a new Endpoint?
Slow initialization, even with FlashBoot, is counted as execution time
worker-vllm 'build Docker image with model inside' fails
Option 2: Build Docker Image with Model Inside. To build an image with the model baked in, you must specify the following Docker arguments when building the image. ...
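Purely as an illustration of the shape of such a build, the command below passes model details as build arguments; the build-arg names here are hypothetical placeholders, not the actual arguments elided above (check the worker-vllm README for those).
```
# Hypothetical example only: baking a model into the worker image at build time.
# MODEL_NAME and QUANTIZATION are placeholder build-arg names, not documented ones.
docker build \
  --build-arg MODEL_NAME="TheBloke/goliath-120b-AWQ" \
  --build-arg QUANTIZATION="awq" \
  -t myuser/worker-vllm-goliath:latest .
```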
Getting TypeError: Failed to fetch when uploading video
SSLCertVerificationError from a custom API
Does an async generator allow a worker to take on multiple jobs? Concurrency modifier?
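For reference, concurrency in the RunPod Python SDK is controlled by a concurrency_modifier callable passed alongside the handler, independent of whether the handler is a generator. A rough sketch; the fixed limit of 4 is an arbitrary illustrative choice:
```
# Rough sketch: allow one worker to hold several jobs at once.
# The fixed limit of 4 is arbitrary and purely illustrative.
import runpod

async def handler(job):
    # ... do async work on job["input"] ...
    return {"echo": job["input"]}

def concurrency_modifier(current_concurrency: int) -> int:
    # Return how many jobs this worker may run concurrently.
    return 4

runpod.serverless.start({
    "handler": handler,
    "concurrency_modifier": concurrency_modifier,
})
```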
Does RunPod provide free compute grants for startups?
Custom Checkpoint Model like DreamShaper
How to force Runpod to pull latest docker image?
Endpoint creation won't accept environment variables
The "Add variables" button is grayed out, so I can't click on it ...
Just in case: I already tried refreshing the UI and checked DevTools to see whether the API was the cause, but no...
How to get around the 10/20 MB payload limit?
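One common workaround (a general pattern, not an official RunPod feature) is to upload large inputs to object storage and pass only a URL in the job payload. A sketch using boto3 against any S3-compatible bucket; the bucket name, endpoint URL, and object key are placeholders:
```
# Sketch: sidestep the request-size limit by sending a URL instead of raw bytes.
# Bucket name, endpoint URL, and object key below are placeholders.
import boto3

s3 = boto3.client("s3", endpoint_url="https://my-s3-compatible-endpoint.example")
s3.upload_file("input_video.mp4", "my-bucket", "jobs/input_video.mp4")

url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "jobs/input_video.mp4"},
    ExpiresIn=3600,  # the handler downloads the file from this URL on the worker
)
payload = {"input": {"video_url": url}}
```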