Santy · 3w ago

How to make API calls to the endpoints with a system prompt?

Hi everyone, I'm new to using RunPod's serverless endpoints for LLM calls. So far I've only worked with the OpenAI APIs, and we've built a product around GPT-4 models. Now we're planning to transition to open-source alternatives. I've successfully created serverless endpoints on RunPod for models like Qwen 14B Instruct and Llama 8B Instruct, and I can get outputs from them using both the RunPod SDK and the UI with JSON input like this:
{
  "input": {
    "prompt": "Write a binary code for search in Python",
    "sampling_params": {
      "max_tokens": 5000,
      "temperature": 0.7
    }
  }
}
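For reference, this is roughly how I'm sending that payload with the RunPod Python SDK (a minimal sketch; the API key and endpoint ID are placeholders):

import runpod

runpod.api_key = "YOUR_RUNPOD_API_KEY"  # placeholder

endpoint = runpod.Endpoint("YOUR_ENDPOINT_ID")  # placeholder

# run_sync blocks until the job completes (or the timeout expires)
result = endpoint.run_sync(
    {
        "input": {
            "prompt": "Write a binary code for search in Python",
            "sampling_params": {"max_tokens": 5000, "temperature": 0.7},
        }
    },
    timeout=120,
)
print(result)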
However, I want to send system prompts and user prompts the way the OpenAI APIs do. When I tried using the OpenAI SDK against the RunPod endpoint, I got an Internal Server Error. Here's my code:
from openai import OpenAI

client = OpenAI(
    base_url=f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1",
    api_key=api_key,
)

chat_completion = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    messages=[{"role": "user", "content": "Reply with: Hello, World!"}],
)

print(chat_completion)
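What I'm ultimately after is a system + user message pair, something like the sketch below. The model name here is only an example: I'm assuming it needs to match whatever model the endpoint was actually deployed with, but I'm not certain of that.

chat_completion = client.chat.completions.create(
    # Assumption: the model id should be the one this endpoint serves
    # (e.g. the Qwen or Llama model it was deployed with), not an
    # arbitrary Hugging Face id. "Qwen/Qwen1.5-14B-Chat" is hypothetical.
    model="Qwen/Qwen1.5-14B-Chat",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a binary search in Python."},
    ],
    max_tokens=5000,
    temperature=0.7,
)
print(chat_completion.choices[0].message.content)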
Does anyone have insights on why this might be happening or how to resolve it? Thanks in advance!
1 Reply
nerdylive · 3w ago
Check the worker logs / endpoint logs to see if there's any error with your request. Did the request actually reach the endpoint?
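A minimal way to check both things from Python (a sketch, assuming the standard /health and /status routes of the serverless REST API; endpoint_id, api_key, and job_id are placeholders):

import requests

headers = {"Authorization": f"Bearer {api_key}"}  # api_key: placeholder

# Endpoint health: worker counts plus queued/in-progress job counts
health = requests.get(
    f"https://api.runpod.ai/v2/{endpoint_id}/health", headers=headers
)
print(health.json())

# Status (and any error message) for a specific job id returned by /run
status = requests.get(
    f"https://api.runpod.ai/v2/{endpoint_id}/status/{job_id}", headers=headers
)
print(status.json())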
