In the worker configuration, I selected the '48 GB GPU' tier (A6000, A40). When I run an endpoints query (from the documentation: https://docs.runpod.io/sdks/graphql/manage-endpoints, "View your Endpoints") to list all of them, the corresponding endpoint ID shows RTX 4090 and A40 as the worker's GPUs. I tried a POST request via curl with the corresponding GPU IDs (from the documentation: https://docs.runpod.io/sdks/python/apis, "Get GPUs"), but the workers do not get any GPUs assigned to them. They do get GPUs assigned when I specify an RTX 4090 instead of an A6000.
Solution
Hi, I tried the GraphQL API and it works with this request:
mutation {
  saveEndpoint(input: {
    # options for gpuIds are "AMPERE_16,AMPERE_24,AMPERE_48,AMPERE_80,ADA_24"
    gpuIds: "AMPERE_48",
    idleTimeout: 5,
    # leave locations as an empty string or null for any region
    # options for locations are "CZ,FR,GB,NO,RO,US"
    # locations: "",
    # append -fb to your endpoint's name to enable FlashBoot
    name: "Generated Endpoint -fb",
    # uncomment below and provide an ID to mount a network volume to your workers
    # networkVolumeId: "",
    scalerType: "QUEUE_DELAY",
    scalerValue: 4,
    templateId: "yccyuy2aeh",
    workersMax: 3,
    workersMin: 0
  }) {
    gpuIds
    id
    idleTimeout
    locations
    name
    # networkVolumeId
    scalerType
    scalerValue
    templateId
    workersMax
    workersMin
  }
}
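Since the original issue came up while sending the mutation over a plain POST request, here is a minimal Python sketch of how that same mutation can be built and posted as JSON. It assumes the GraphQL endpoint is `https://api.runpod.io/graphql` with the API key passed as the `api_key` query parameter (as in the RunPod docs); the `build_save_endpoint_mutation` helper and its parameters are illustrative, not part of the RunPod SDK.

```python
import json
import urllib.request


def build_save_endpoint_mutation(gpu_ids: str, name: str, template_id: str) -> str:
    """Build the saveEndpoint mutation as a single GraphQL query string.

    Using a GPU pool ID such as "AMPERE_48" (A6000/A40) for gpuIds, rather
    than an individual GPU type ID, is what resolved the original issue.
    """
    return (
        "mutation { saveEndpoint(input: { "
        f'gpuIds: "{gpu_ids}", '
        "idleTimeout: 5, "
        f'name: "{name}", '
        'scalerType: "QUEUE_DELAY", scalerValue: 4, '
        f'templateId: "{template_id}", '
        "workersMax: 3, workersMin: 0 "
        "}) { gpuIds id name templateId workersMax workersMin } }"
    )


def save_endpoint(api_key: str, query: str) -> dict:
    """POST the mutation to the RunPod GraphQL API and return the JSON reply."""
    req = urllib.request.Request(
        f"https://api.runpod.io/graphql?api_key={api_key}",
        data=json.dumps({"query": query}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

A call like `save_endpoint(my_key, build_save_endpoint_mutation("AMPERE_48", "Generated Endpoint -fb", "yccyuy2aeh"))` then mirrors the GraphQL request above.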