Runpod•7d ago

Monitoring

Is there a way to monitor the CPU and GPU utilization and scrape these metrics with Prometheus, since I need to visualize it with Grafana dashboards and use my own alerting?

7 Replies

Dj•7d ago

Hey! Not yet, but this is a common feature request I want to have supported soon.

RubinartOP•6d ago

Okay, thank you for the reply. I'd also like to ask: is there any additional charge based on the number of GraphQL requests?

Dj•6d ago

No! We have a rate limit, but it's very generous to my knowledge.

RubinartOP•5d ago

Since I will be sending a GraphQL request every 5 minutes, for example, to get the metrics I need and base my alerts on them, is there a possibility of me hitting that limit? I am curious whether I have to set up an alert for too many requests and find another way to handle it.

Unknown User•3d ago

Message Not Public

Dj•3d ago

Great idea. In general, prefer push over pull for "ephemeral" workloads: you can tag the metrics with the pod's data and runtime from inside the container rather than assuming or querying it. But if you decide to pull, it's very unlikely we'll get in your way.
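
For anyone who wants to try the push route, here is a minimal sketch, not an official Runpod integration. It assumes a Prometheus Pushgateway you run yourself and that `nvidia-smi` is available in the container. The metric name `pod_gpu_utilization_percent`, the job name `runpod_gpu`, and the gateway URL are made up for illustration, and reading the pod ID from a `RUNPOD_POD_ID` environment variable is an assumption to verify in your own pod.

```python
import os
import subprocess
import urllib.request


def read_gpu_csv():
    """Ask nvidia-smi for per-GPU utilization as headerless CSV, e.g. '0, 35'."""
    return subprocess.run(
        ["nvidia-smi", "--query-gpu=index,utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout


def parse_gpu_csv(csv_text):
    """Turn 'index, utilization' lines into (index, percent) tuples."""
    samples = []
    for line in csv_text.strip().splitlines():
        index, util = [field.strip() for field in line.split(",")]
        try:
            samples.append((index, float(util)))
        except ValueError:
            continue  # skip values nvidia-smi cannot report (e.g. "[N/A]")
    return samples


def render_metrics(samples):
    """Render samples in the Prometheus text exposition format."""
    lines = ["# TYPE pod_gpu_utilization_percent gauge"]  # hypothetical metric name
    for index, util in samples:
        lines.append(f'pod_gpu_utilization_percent{{gpu="{index}"}} {util}')
    return "\n".join(lines) + "\n"


def push(body, gateway_url, pod_id, job="runpod_gpu"):
    """PUT the metrics to a Pushgateway, using the pod ID as a grouping label."""
    url = f"{gateway_url}/metrics/job/{job}/pod_id/{pod_id}"
    request = urllib.request.Request(url, data=body.encode(), method="PUT")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def collect_and_push(gateway_url):
    # Assumption: the pod ID is exposed as RUNPOD_POD_ID inside the container.
    pod_id = os.environ.get("RUNPOD_POD_ID", "unknown")
    return push(render_metrics(parse_gpu_csv(read_gpu_csv())), gateway_url, pod_id)
```

Calling `collect_and_push("http://my-pushgateway:9091")` on a timer inside the container would then let Prometheus scrape the gateway, with the `pod_id` label available for Grafana dashboards and alert rules.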

riverfog7•2d ago

About 2 requests per second, from my experience.
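
If the limit really is around 2 requests per second, that allows roughly 600 requests in 5 minutes, so one request every 5 minutes is far below it. If you still want a safety net when pulling, a retry with exponential backoff is a common pattern. The sketch below assumes the API signals rate limiting with HTTP 429, which is not confirmed in this thread; `fetch` is a hypothetical callable that performs your own GraphQL request.

```python
import time
import urllib.error


def poll_with_backoff(fetch, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on HTTP 429, wait and retry with doubling delays.

    Assumption: rate limiting surfaces as urllib.error.HTTPError with code 429.
    Any other error, or a 429 after the last retry, is re-raised to the caller.
    """
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Counting how often the retry path is taken would give you the "too many requests" signal to alert on, without needing a separate mechanism.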
