Connect pods to GKE cluster
Our current inference operations are conducted on Google Kubernetes Engine (GKE). We are interested in leveraging RunPod's GPU offerings for inference tasks. Could you provide information on how to set up and utilize RunPod GPUs for inference purposes?
Basically, you push your image to an external container registry and add the registry auth details in RunPod:
https://docs.runpod.io/pods/templates/overview
Then you create a pod using your image.
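To make the "create a pod using your image" step concrete, here's a minimal sketch using RunPod's Python SDK (`pip install runpod`). The registry auth itself is saved once in the RunPod console; the pod name, image, GPU type, and ports below are placeholders of mine, and `create_pod`'s exact parameters should be checked against the current SDK docs.

```python
# Sketch: launching a Pod from a custom image with RunPod's Python SDK.
# All names below are placeholders; verify parameters against the SDK docs.

def pod_config(image, gpu_type):
    """Build keyword arguments for runpod.create_pod()."""
    return {
        "name": "gke-offload-inference",   # placeholder pod name
        "image_name": image,               # e.g. "ghcr.io/acme/inference:latest"
        "gpu_type_id": gpu_type,           # e.g. "NVIDIA RTX A5000"
        "ports": "8000/http,5432/tcp",     # ports your service exposes
    }

def launch(image, gpu_type):
    """Illustration only: needs the `runpod` SDK and a RUNPOD_API_KEY."""
    import os
    import runpod
    runpod.api_key = os.environ["RUNPOD_API_KEY"]
    return runpod.create_pod(**pod_config(image, gpu_type))
```

The same launch can be done entirely from the RunPod web console if you'd rather not script it.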
You can use Pods or Serverless: https://docs.runpod.io/pods/overview
But note this isn't Kubernetes: each Pod is a single container, not a cluster.
@nerdylive Is there a way to auto-scale pods based on demand or do we need serverless for that?
You need Serverless for that.
You can scale Pods using the REST API,
but yeah, that's manual.
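The "manual" scaling mentioned above can be scripted on the GKE side: poll your queue, compute a target Pod count, then create or terminate Pods through RunPod's REST API. The API calls themselves are omitted here (check RunPod's API reference for the endpoints); `desired_pods()` below is just a sketch of the sizing decision, with my own assumed defaults for the min/max bounds.

```python
# Sketch of the sizing decision for a do-it-yourself Pod autoscaler.
# The actual create/terminate calls against RunPod's API are not shown.

def desired_pods(queue_depth, per_pod_capacity, min_pods=1, max_pods=8):
    """Smallest Pod count so each Pod handles at most per_pod_capacity
    queued requests, clamped to [min_pods, max_pods]."""
    needed = -(-queue_depth // per_pod_capacity)  # ceiling division
    return max(min_pods, min(max_pods, needed))
```

You would run this in a loop (e.g. a small controller on GKE) and reconcile the live Pod count toward the returned value.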
@nerdylive There are other essential services like DBs and message queues running on my GKE cluster which need to be reachable from my service running on RunPod. Can all of this be configured through ports?
Yes, but ports in RunPod aren't like ports in other clouds.
HTTP ports are forwarded through a proxy link (you can deploy a pod and see how it works, or read the RunPod docs).
And TCP ports are forwarded to a different external port.
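To illustrate the two forwarding styles just described: HTTP goes through a per-pod proxy hostname, while TCP is exposed on the pod's public IP at a remapped port. The `{pod_id}-{port}.proxy.runpod.net` pattern below reflects RunPod's docs as I understand them; confirm the actual hostnames and remapped ports in your pod's Connect tab before wiring GKE services to them.

```python
# Sketch: how a service on GKE would address a RunPod Pod.
# The URL pattern and port remapping are assumptions from RunPod's docs.

def http_url(pod_id, internal_port):
    """HTTP traffic goes through RunPod's proxy, not a raw open port."""
    return f"https://{pod_id}-{internal_port}.proxy.runpod.net"

def tcp_endpoint(public_ip, external_port):
    """TCP is exposed on the pod's public IP at a remapped external port."""
    return f"{public_ip}:{external_port}"
```

Note the remapped TCP port is assigned by RunPod, so your GKE-side config has to read it from the pod's details rather than hard-coding the internal port.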
I'm not sure about that side of it, I haven't tried connecting them myself. How do you usually connect them?