RunPod•4mo ago
JHenriP

Serverless 404

Hi there, I'm getting a 404 error when sending requests during a development session (runpodctl project dev). Everything worked great locally using --rp_serve_api. The only difference is that I changed the URL from localhost to https://api.runpod.ai/v2/my_pod_id/runsync and added the authentication key for the deployment. I'm using Postman to send the request. Has anyone faced this problem? I can't figure out what I'm doing wrong.
17 Replies
ashleyk
ashleyk•4mo ago
You can't use my_pod_id; it must be a serverless endpoint ID, not a pod ID.
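For anyone hitting the same 404, here is a minimal sketch of how the runsync request could be built with a serverless endpoint ID in the URL. The URL pattern comes from the original post. The `Bearer` authorization header and the top-level `"input"` wrapper are assumptions about the request format, and `build_runsync_request`, `YOUR_ENDPOINT_ID`, and `YOUR_API_KEY` are hypothetical names used only for illustration.

```python
import json
import urllib.request

# Base URL taken from the original post.
API_BASE = "https://api.runpod.ai/v2"


def build_runsync_request(endpoint_id: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request to a serverless endpoint's /runsync route.

    endpoint_id must be the ID of a serverless endpoint, not a pod ID.
    """
    url = f"{API_BASE}/{endpoint_id}/runsync"
    # Assumption: the worker expects the job payload under an "input" key.
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the API key is sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
    )


if __name__ == "__main__":
    req = build_runsync_request(
        "YOUR_ENDPOINT_ID",  # hypothetical placeholder
        "YOUR_API_KEY",      # hypothetical placeholder
        {"audio_base64": "placeholder"},
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req, timeout=120)
```

The same URL, header, and JSON body can be entered in Postman. The key point from the reply above is that the path segment after /v2/ has to be the endpoint ID.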
JHenriP
JHenriP•4mo ago
You're right, thx @ashleyk, I misunderstood it.
@ashleyk not related, but do you happen to know if it's mandatory to use rp_cuda? My worker is getting stuck and I don't see GPU usage ramping up.
ashleyk
ashleyk•4mo ago
What is rp_cuda?
JHenriP
JHenriP•4mo ago
GitHub
worker-faster_whisper/src/predict.py at main · runpod-workers/worke...
🎧 | RunPod worker of the faster-whisper model for Serverless Endpoint. - runpod-workers/worker-faster_whisper
JHenriP
JHenriP•4mo ago
Found it in this repo. I'm also doing STT.
JHenriP
JHenriP•4mo ago
Right now I can't even runsync from here (don't mind what's inside audio_base64, I put that as a placeholder only).
JHenriP
JHenriP•4mo ago
I had to cancel all requests manually.
Justin Merrell
Justin Merrell•4mo ago
@Marut
Marut
Marut•4mo ago
@JHenriP Are you still facing the issue with the worker?
JHenriP
JHenriP•4mo ago
Haven't tried since then. Will try again later on today.
Still facing the same issue @Marut
Marut
Marut•4mo ago
Can you share the error and your setup? That would be helpful.
JHenriP
JHenriP•4mo ago
There's no error actually, simply nothing happens, and looking at the worker utilization nothing is ramping up.
Base image: runpod/base:0.6.1-cuda12.2.0
Requirements: torch, hf_transfer, accelerate, flash-attn, transformers, runpod
Everything worked fine on local deployment.
Marut
Marut•4mo ago
Let me check & try to reproduce!
JHenriP
JHenriP•4mo ago
@Marut any updates?
Marut
Marut•4mo ago
Hey, it works fine. I tested it.
ashleyk
ashleyk•4mo ago
What is this? Is this something different from the faster whisper link you shared above?
JHenriP
JHenriP•4mo ago
@Merrell @Marut @ashleyk It ended up being a problem with flash-attn 🙂 With the base image runpod/base:0.6.1-cuda12.2.0 and an A4000, apparently you can't have flash-attn in requirements.txt.
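One possible way to keep a worker usable without flash-attn in requirements.txt is to choose the attention backend at startup based on whether the package is installed. This is only a sketch under assumptions, not what was done in this thread: `pick_attn_implementation` is a hypothetical helper, and it assumes the model is loaded through transformers with its `attn_implementation` argument.

```python
import importlib.util


def pick_attn_implementation(module_name: str = "flash_attn") -> str:
    """Return "flash_attention_2" if the given module is installed, else "sdpa".

    This only checks that the package can be found. It does not verify that
    flash-attn actually works on the current GPU or CUDA version.
    """
    if importlib.util.find_spec(module_name) is not None:
        return "flash_attention_2"
    # Fall back to PyTorch's built-in scaled dot-product attention.
    return "sdpa"


if __name__ == "__main__":
    print(pick_attn_implementation())
```

The returned string could then be passed as `attn_implementation=...` to a transformers `from_pretrained` call, assuming a transformers version that accepts that argument.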