Runpod · 2mo ago
swousy

Completion-style instead of Instruct-style responses

I'm using the default hello world code on serverless: { "input": { "prompt": "Hello World" } }, but I'm getting a completion-style response instead of an instruct-style response, despite using a Llama Instruct model. To clarify, the response I'm getting is: "! Welcome to my blog about London: the Great City!...". How do I change the prompt format to get instruct-style responses? Where can I find the syntax?
5 Replies
swousy (OP) · 2mo ago
This page says to check my worker's documentation - where do I see that? https://docs.runpod.io/serverless/endpoints/send-requests
swousy (OP) · 2mo ago
I just want to know what the syntax is to submit a conversation, like "messages": [ {"role": "user", "content": "Give me a short introduction to large language models."} ]
3WaD · 2mo ago
The syntax is exactly as you say.
{
  "input": {
    "messages": [
      {"role": "user", "content": "Give me a short introduction to large language models."}
    ]
  }
}
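For context, here is a minimal sketch of sending that payload to a serverless endpoint from Python, assuming the /runsync route and that ENDPOINT_ID and RUNPOD_API_KEY are placeholders you replace with your own values:

import os
import requests

# Hypothetical placeholders - substitute your own endpoint ID and API key.
ENDPOINT_ID = os.environ["ENDPOINT_ID"]
API_KEY = os.environ["RUNPOD_API_KEY"]

url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync"
payload = {
    "input": {
        "messages": [
            {"role": "user", "content": "Give me a short introduction to large language models."}
        ]
    }
}

# Synchronous request; the worker applies the model's chat template to "messages",
# which is what produces instruct-style rather than completion-style output.
response = requests.post(
    url,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=120,
)
print(response.json())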
3WaD · 2mo ago
The vLLM worker documentation is here: https://github.com/runpod-workers/worker-vllm
GitHub - runpod-workers/worker-vllm: The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
swousy (OP) · 2mo ago
@3WaD Could you please give me an example of how to use CUSTOM_CHAT_TEMPLATE? The above doesn't work with Llama 3.1 8B Instruct, and I don't know how to use a different chat template. Nevermind, I just realised how to do it - ignore me.
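For anyone else who lands here: CUSTOM_CHAT_TEMPLATE is only needed when the model's tokenizer doesn't already ship a chat template; Instruct models like Llama 3.1 normally include one. Below is a minimal sketch of what that template step does, assuming the Hugging Face transformers library and that the model ID shown matches your gated access (the worker does this rendering for you when you send "messages"):

# Render messages through the model's built-in Jinja chat template,
# producing the special-token-wrapped prompt that "messages" input triggers
# and a raw "prompt" string does not.
from transformers import AutoTokenizer

# Assumed model ID for illustration; use whatever Instruct model you deploy.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
messages = [{"role": "user", "content": "Give me a short introduction to large language models."}]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)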
