Serverless SGLang spent credits on phantom requests
I deployed a serverless endpoint (id ua6ui6kfksdocn). I tried sending a sample request from the web dash; that one still seems to be in the queue 20 hours later.
However, looking at the logs, there are lots of requests like this:
I'm assuming that's what kept the workers alive, spending the credits in vain.
I'm also assuming the addresses in the log are the source addresses of the requests. Could that be some RunPod process trying to get the list of models?
Any clue on how to resolve this and prevent it from happening in the future?
2 Replies
Huh, I'm not sure, it might be an attack. I don't know how SGLang or the SGLang workers work, but you can check the code here: https://github.com/runpod-workers/worker-sglang/
Feel free to open a support ticket.