Serverless Text Embedding - 400

I'm using a text embedding serverless endpoint to run an instance of "sentence-transformers/all-MiniLM-L6-v2". I keep getting a 400 Bad Request error. The old code I had (using the OpenAI SDK) stopped working, and I've tried to configure it based on the new documentation without any luck. Would greatly appreciate any help!

New
---------
runpod.api_key = os.getenv("RUNPOD_API_KEY")
cleaned_text = {"prompt": "Hello, World!"}
endpoint = runpod.Endpoint("i10xxxxxxxxvp")
run_request = endpoint.run("Your text to embed here")

Old (worked before)
-------
cleaned_text = 'This is clean text.'
# Initialize OpenAI client with RunPod configuration
api_key = os.getenv("RUNPOD_API_KEY")
client = OpenAI(
    api_key=api_key,
    base_url=os.getenv("https://api.runpod.ai/v2/i10rxxxxxxxxxxxp/openai/v1")
)
# Get embedding using OpenAI-compatible endpoint
response = client.embeddings.create(
    model="sentence-transformers/all-MiniLM-L6-v2",
    input=cleaned_text
)
embedding = response.data[0].embedding
8 Replies
Jason (3w ago)
Can you share what the error looks like?
Rahul Bhatewara (OP, 3w ago)
With runpod.endpoint
---------------------
Job output: {'code': 400, 'message': "Invalid input: {'delayTime': 17100, 'id': '31304f73-9a33-4c1d-865d-bad4496705b5-e1', 'input': {'prompt': 'Hello, World!'}, 'status': 'IN_PROGRESS'}", 'object': 'error', 'param': None, 'type': 'BadRequestError'}

With OpenAI SDK:
openai.AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: rpa_1KJN**bk9q. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}
Jason (3w ago)
Did you export the RUNPOD_API_KEY environment variable? Set it to your RunPod API key. "Invalid input" means your request is invalid. Try just using the OpenAI SDK, which is easier, or read the docs in the vllm-worker GitHub repo.
Rahul Bhatewara (OP, 3w ago)
Thanks, Jason. The RUNPOD_API_KEY env var is set properly and is being read by the code. I believe the issue could be that I'm running an embedding model rather than a chatbot, so the input format {'prompt': 'Hello World'} isn't what the endpoint expects. I've looked through the RunPod documentation to no avail. Any more thoughts?
Jason (3w ago)
GitHub - runpod-workers/worker-infinity-embedding: Create embeddings with infinity as serverless endpoint
Jason (3w ago)
I guess you should use this:

response = client.embeddings.create(
    model="sentence-transformers/all-MiniLM-L6-v2",
    input=cleaned_text
)

I'm not sure the model ID is correct, though. Try querying it using client.**** to get the model IDs, and check your logs too. Actually, try setting the permissions on your API key again (try a different permission setting) and let me know which ones you've tried.
Rahul Bhatewara (OP, 3w ago)
Hi Jason. I finally got it working with the endpoint by passing this JSON: {"input":{"model":"...", "input": "..."}}. The OpenAI SDK method kept failing after a lot of trial and error; it kept reporting an invalid API key, even with a new key. Thanks for your help!
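
For anyone landing here later, the payload shape described above can be sketched as follows. This is a minimal sketch, not the worker's documented client: the helper names build_embedding_payload and run_embedding_job and the RUNPOD_ENDPOINT_ID variable are made up for illustration, and it assumes the endpoint is reachable over plain HTTP at the /runsync route with a Bearer token. It uses only the standard library, and the request is sent only if an endpoint ID is configured.

```python
import json
import os
import urllib.request


def build_embedding_payload(model: str, text: str) -> dict:
    """Wrap the embedding request in the top-level "input" key, as in the post above."""
    return {"input": {"model": model, "input": text}}


def run_embedding_job(endpoint_id: str, api_key: str, payload: dict) -> dict:
    """POST the payload to the endpoint's synchronous route and return the parsed JSON."""
    request = urllib.request.Request(
        f"https://api.runpod.ai/v2/{endpoint_id}/runsync",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_embedding_payload(
    "sentence-transformers/all-MiniLM-L6-v2", "Hello, World!"
)
print(json.dumps(payload))

# RUNPOD_ENDPOINT_ID is a hypothetical variable name; nothing is sent without it.
endpoint_id = os.getenv("RUNPOD_ENDPOINT_ID")
if endpoint_id:
    result = run_embedding_job(endpoint_id, os.environ["RUNPOD_API_KEY"], payload)
    print(result.get("status"))
```

Compare this with the 400 error earlier in the thread, where the inner object was {'prompt': 'Hello, World!'} rather than a model/input pair.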
Jason (3w ago)
Just paste it directly as a string in the client:

client = OpenAI(
    api_key=api_key,
    base_url=os.getenv("https://api.runpod.ai/v2/i10rxxxxxxxxxxxp/openai/v1")
)

If it keeps failing, maybe it has something to do with your env.
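
A small sketch of why pasting the URL as a plain string may matter here. os.getenv() takes the name of an environment variable, so passing the URL itself looks up a variable that almost certainly does not exist and returns None. With base_url=None, the OpenAI Python client falls back to its default host, which would be consistent with the 401 above pointing at platform.openai.com. ENDPOINT_ID and make_client are hypothetical names used only for illustration, and building the client requires the openai package.

```python
import os

# Hypothetical placeholder; substitute your own endpoint ID.
ENDPOINT_ID = "your-endpoint-id"
RUNPOD_OPENAI_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/openai/v1"

# os.getenv() treats its argument as a variable NAME, so this looks for an
# environment variable literally named after the URL and finds nothing.
mistaken_base_url = os.getenv(RUNPOD_OPENAI_URL)

# Passing the URL as a plain string avoids the lookup entirely.
base_url = RUNPOD_OPENAI_URL


def make_client():
    """Build an OpenAI-compatible client pointed at the RunPod endpoint."""
    from openai import OpenAI  # imported lazily; requires the openai package

    return OpenAI(api_key=os.getenv("RUNPOD_API_KEY"), base_url=base_url)


print(mistaken_base_url)
print(base_url)
```

Only the API key is read from the environment in this version; the base URL is an ordinary string literal.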
