(Beginner Question) Hosting a Quantized Model
Using an RTX 4090 Pod, I want to create an API from Civitai diffuser models. Is Cog the best way?
trl vllm-serve not binding to port
Running vLLM on one of them via:
VLLM_LOGGING_LEVEL=DEBUG NCCL_DEBUG=TRACE trl vllm-serve --model meta-llama/Meta-Llama-3-8B-Instruct --gpu_memory_utilization=0.75 --max_model_len 2048 --host 0.0.0.0 --port 8000
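If the process starts but never answers on the port, it helps to confirm from inside the pod whether anything actually bound it. A minimal check, assuming a Linux image with iproute2 (`ss`) installed; the port matches the command above:

```shell
# Port passed to trl vllm-serve above.
PORT=8000
# ss lists listening TCP sockets; grep for the port we expect.
if ss -ltn 2>/dev/null | grep -q ":${PORT}"; then
  echo "a process is listening on ${PORT}"
else
  echo "nothing is listening on ${PORT}; the server likely exited before binding"
fi
```

If nothing is listening, the DEBUG/TRACE logs usually show a crash during model load (often an out-of-memory) rather than a networking problem; if something is listening but the port is unreachable from outside, check that the port is exposed in the pod configuration.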
Any easy pipeline to migrate from GCP Cloud Compute VM Instance to Runpod Cluster?
How do I do this?
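There is no one-click migration, but a common path is to copy the data straight from the GCP VM to the pod over SSH. A minimal sketch, assuming you have SSH access to the pod; the host, port, and paths are placeholders, and `echo` prints the command for review rather than running it:

```shell
# Placeholders; substitute your pod's SSH endpoint and real paths.
SRC_DIR="/data"                # directory on the GCP VM to copy
POD_HOST="pod.example.com"     # hypothetical pod SSH endpoint
POD_PORT="10022"               # hypothetical exposed SSH port

# Run from the GCP VM; rsync preserves permissions and resumes
# interrupted transfers. Remove the leading echo to actually run it.
echo rsync -avz -e "ssh -p ${POD_PORT}" "${SRC_DIR}/" "root@${POD_HOST}:/workspace/data/"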
Bug in Runpod ComfyUI Network Volume Setup
The /workspace/comfyui folder is not the actual one. The actual one is in /ComfyUI,
which is outside of the network volume mount. This means that if you terminate the pod, your progress is lost. I think the folder should be inside the volume mount so the state persists between pod reinitialisations.

Web UI was demanding I pay just to start a pod, but I have plenty of credits
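For the ComfyUI path issue above, a common workaround until the template is fixed is to move the install into the network volume and leave a symlink at the old path so existing scripts keep working. A sketch using a temp directory to illustrate; on a real pod the paths would be /ComfyUI and /workspace/comfyui:

```shell
# Demo paths under a temp dir; on a pod use /ComfyUI and /workspace/comfyui.
ROOT=$(mktemp -d)
mkdir -p "${ROOT}/ComfyUI" "${ROOT}/workspace"

# Move the install into the persistent volume, then symlink the old path.
mv "${ROOT}/ComfyUI" "${ROOT}/workspace/comfyui"
ln -s "${ROOT}/workspace/comfyui" "${ROOT}/ComfyUI"

ls -ld "${ROOT}/ComfyUI"   # now a symlink into the volume mount
```

After this, anything writing to /ComfyUI lands on the network volume, so models and outputs survive pod termination.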

GPU Suddenly Stopped Working
Custom Docker image which uses Streamlit and PostgreSQL
Unable to start kernel
Assistance with SSH Access to My Pod
Global Networking

[GPU not assigned to Pod – Need help]

Network Issue
Unable to back up volume data to Google Cloud storage bucket
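One way to back up a volume to GCS is a recursive rsync with gsutil, assuming gsutil is installed on the pod and authenticated (for example via `gcloud auth activate-service-account` with a key file). The bucket name below is a placeholder, and `echo` prints the command for review rather than running it:

```shell
# Placeholders; substitute your bucket and the directory to back up.
BUCKET="gs://my-backup-bucket"
SRC="/workspace"

# -m parallelizes, rsync -r mirrors the tree. Remove echo to run for real.
echo gsutil -m rsync -r "${SRC}" "${BUCKET}/workspace-backup"
```

If the transfer fails, the usual suspects are missing credentials on the pod or outbound network restrictions, so test with a single small file first.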

Can't send using runpodctl and can't resend
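For reference, the typical runpodctl transfer flow pairs a sender and a receiver through a one-time code, which is why a used code generally cannot be replayed to resend; re-running `send` issues a fresh code. A sketch with a placeholder filename (`echo` prints the commands rather than running them):

```shell
# Placeholder file; run on the sending machine.
FILE="model.safetensors"
echo runpodctl send "${FILE}"            # on real use this prints a one-time code

# On the receiving machine, paste that code (placeholder shown):
echo runpodctl receive "<one-time-code>"
```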
Not Working

runpod-torch-v280 & RTX 4090 unsatisfied condition: cuda>=12.8
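This error appears to mean the runpod-torch-v280 image declares it needs a host driver supporting CUDA 12.8 or newer, and the machine offering the RTX 4090 exposes an older driver. From inside any pod you can check what the host driver actually provides (guarded so it also runs on a non-GPU machine):

```shell
# The header nvidia-smi prints includes "CUDA Version: X.Y", which is the
# maximum CUDA version the host driver supports.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi | head -n 4
else
  echo "nvidia-smi not found; not on a GPU host"
fi
```

When deploying, filtering for hosts with a newer CUDA version (where the UI or API offers such a filter) avoids landing on machines that fail this condition.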
What will happen if I resume a pod on a fully occupied physical machine via the API?