Serverless Text Embedding - 400
I'm using a text embedding serverless endpoint to run an instance of "sentence-transformers/all-MiniLM-L6-v2". I keep getting a 400 Bad Request error. The old code I had (using the OpenAI SDK) stopped working, and I've tried to reconfigure it based on the new documentation without any luck. Would greatly appreciate any help!
New
---------
import os
import runpod

runpod.api_key = os.getenv("RUNPOD_API_KEY")
cleaned_text = {"prompt": "Hello, World!"}
endpoint = runpod.Endpoint("i10xxxxxxxxvp")
run_request = endpoint.run(cleaned_text)
Old (worked before)
-------
import os
from openai import OpenAI

cleaned_text = 'This is clean text.'

# Initialize OpenAI client with RunPod configuration
api_key = os.getenv("RUNPOD_API_KEY")
client = OpenAI(
    api_key=api_key,
    base_url=os.getenv("https://api.runpod.ai/v2/i10rxxxxxxxxxxxp/openai/v1")
)

# Get embedding using OpenAI-compatible endpoint
response = client.embeddings.create(
    model="sentence-transformers/all-MiniLM-L6-v2",
    input=cleaned_text
)
embedding = response.data[0].embedding
8 Replies
Can you share what the error looks like?
With runpod.Endpoint
---------------------
Job output: {'code': 400, 'message': "Invalid input: {'delayTime': 17100, 'id': '31304f73-9a33-4c1d-865d-bad4496705b5-e1', 'input': {'prompt': 'Hello, World!'}, 'status': 'IN_PROGRESS'}", 'object': 'error', 'param': None, 'type': 'BadRequestError'}
With OpenAI SDK:
openai.AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: rpa_1KJN**bk9q. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}
Did you export the RUNPOD_API_KEY environment variable?
Set that environment variable to your RunPod API key.
"Invalid input" means your request payload is invalid.
Try just using the OpenAI SDK for simplicity, or read the docs in the vllm-worker GitHub repo.
Thanks, Jason. The RUNPOD_API_KEY env var is set properly and is being read into the code. I believe the issue could be that I'm running an embedding model rather than a chat model, and the input format {'prompt': 'Hello World'} isn't what the endpoint expects. I've looked through the RunPod documentation to no avail. Any more thoughts?
Are you using this? https://github.com/runpod-workers/worker-infinity-embedding
I guess you should use this:
response = client.embeddings.create(
    model="sentence-transformers/all-MiniLM-L6-v2",
    input=cleaned_text
)
But I'm not sure the model ID is correct; try querying the endpoint to get the list of model IDs.
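For reference, a minimal sketch of querying for model IDs with the OpenAI SDK, assuming the same OpenAI-compatible base URL as above and that the route exposes a models list:

import os
from openai import OpenAI

# Assumes the RunPod OpenAI-compatible route also serves a /models listing
client = OpenAI(
    api_key=os.getenv("RUNPOD_API_KEY"),
    base_url="https://api.runpod.ai/v2/i10rxxxxxxxxxxxp/openai/v1",
)

for model in client.models.list():
    print(model.id)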
Try checking your logs too.
Actually, try setting the permissions on your API key again (try a different permission setting).
Let me know which of these you have tried.
Hi Jason. I finally got it working with the endpoint by passing this JSON: {"input": {"model": "...", "input": "..."}}. The OpenAI SDK method kept failing after multiple rounds of trial and error; it kept giving "invalid API key" even with a new key. Thanks for your help!
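For reference, a minimal sketch of that working call through the RunPod SDK, assuming the {"model": ..., "input": ...} schema described above (the SDK wraps the dict under "input" before posting it, which is why the earlier payload arrived as {'input': {'prompt': ...}}):

import os
import runpod

runpod.api_key = os.getenv("RUNPOD_API_KEY")
endpoint = runpod.Endpoint("i10xxxxxxxxvp")

# run_sync blocks until the job finishes; the SDK sends
# {"input": {"model": ..., "input": ...}} to the worker
run_request = endpoint.run_sync({
    "model": "sentence-transformers/all-MiniLM-L6-v2",
    "input": "Hello, World!",
})
print(run_request)

The exact shape of the returned output depends on the worker, so inspect the printed result before indexing into it.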
Just paste the base URL directly as a string in the
client = OpenAI(
    api_key=api_key,
    base_url=os.getenv("https://api.runpod.ai/v2/i10rxxxxxxxxxxxp/openai/v1")
)
call if it keeps failing.
Maybe it has something to do with the env.
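A likely explanation for the 401: os.getenv("https://...") looks up an environment variable literally named after the URL, which doesn't exist, so base_url ends up None and the client falls back to api.openai.com, which rejects the RunPod key. A minimal sketch of Jason's suggestion, passing the base URL as a plain string:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("RUNPOD_API_KEY"),
    # Plain string, not os.getenv(), so requests actually go to RunPod
    base_url="https://api.runpod.ai/v2/i10rxxxxxxxxxxxp/openai/v1",
)

response = client.embeddings.create(
    model="sentence-transformers/all-MiniLM-L6-v2",
    input="This is clean text.",
)
embedding = response.data[0].embedding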