Custom transformer reranker endpoint: should I offload it to Runpod serverless?
🏗️Builder
I have a VM where I serve products. I want to rerank them with a custom personalized transformer (personalized to each user's history), which means sending the user history plus the 500 items to be reranked to the Runpod server.
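To make the request shape concrete, here is a minimal sketch of what the VM could send and how it could apply the scores that come back. It assumes the items are sent as IDs rather than full feature vectors, that the worker returns one score per candidate in the same order, and that the body is wrapped in an `"input"` key; the field names `user_history` and `candidate_ids` and both helper functions are hypothetical, not part of any Runpod API.

```python
import json


def build_rerank_payload(user_history, candidate_ids):
    """Wrap the user history and candidate item IDs in a JSON-serializable body.

    The "input" wrapper and the field names are assumptions for illustration.
    """
    return {
        "input": {
            "user_history": list(user_history),
            "candidate_ids": list(candidate_ids),
        }
    }


def apply_scores(candidate_ids, scores):
    """Return candidate IDs sorted by descending score (ties keep input order)."""
    if len(candidate_ids) != len(scores):
        raise ValueError("expected exactly one score per candidate")
    order = sorted(range(len(candidate_ids)), key=lambda i: -scores[i])
    return [candidate_ids[i] for i in order]


# Hypothetical example: 3 history items and 500 candidates, all as integer IDs.
payload = build_rerank_payload([101, 205, 317], list(range(1000, 1500)))
body = json.dumps(payload).encode("utf-8")
print(len(payload["input"]["candidate_ids"]), "candidates,", len(body), "bytes")
```

Sending IDs only keeps the request small, but it also means the worker needs its own copy of the item features or embeddings; whether that fits depends on how the model consumes items.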
I understand this will add latency, since my CPU-only VM is hosted on OVH and would have to query a Runpod API endpoint for inference. Any thoughts on this architecture?
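One way to bound the extra latency is to give the remote call a fixed time budget and fall back to the original ordering when it is exceeded. The sketch below is only an illustration of that idea: `rerank_fn` stands in for whatever function performs the HTTP call to the endpoint, and both `rerank_with_fallback` and the timeout values are hypothetical.

```python
import concurrent.futures
import time


def rerank_with_fallback(rerank_fn, candidate_ids, timeout_s):
    """Run rerank_fn in a worker thread; return the original order on timeout or error."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(rerank_fn, candidate_ids)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Budget exceeded: serve the unranked list instead of waiting.
        return list(candidate_ids)
    except Exception:
        # The remote call failed: also serve the unranked list.
        return list(candidate_ids)
    finally:
        # Do not block on a call that is still running.
        pool.shutdown(wait=False)


def slow_reranker(ids):
    """Stand-in for a remote call that takes too long."""
    time.sleep(0.3)
    return list(reversed(ids))


def fast_reranker(ids):
    """Stand-in for a remote call that returns in time."""
    return list(reversed(ids))


items = [1, 2, 3, 4]
timed_out = rerank_with_fallback(slow_reranker, items, timeout_s=0.05)
in_time = rerank_with_fallback(fast_reranker, items, timeout_s=1.0)
```

Note that the timed-out call keeps running in its thread until it finishes; a real client would also set a timeout on the HTTP request itself.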