Runpod • 10mo ago
jackson hole

How is the architecture set up in the serverless (please give me a minute to explain myself)

We have been looking at LLM hosting services with autoscaling to make sure we can meet demand, but our main concern is the authentication architecture design.

The basic setup

Based on my understanding, there are the following layers:
1. The application on the user's device sends a request.
2. A dedicated authentication server checks the user's authenticity (API key, bearer token, etc.) and enforces rate limits.
3. Our HTTP server takes that request, processes the data, and sends the request to the LLM server (Runpod serverless).
4. Runpod returns the generated data, and finally the HTTP server post-processes it and sends it back to the user.

---

We want to:
- Make sure no unauthorized device accesses our LLM API.
- Track each user's remaining quota and only let them send a limited number of requests, etc.

👉🏻 As you can see, we have some authentication-related thoughts, but I need a more granular understanding of the standard practices for deploying LLMs commercially, where real customers will be using them. Please guide. Thank you.
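A minimal sketch of the "track each user's leftover quota" goal, assuming a fixed-window limit held in memory. The `QuotaTracker` class, its method names, and the limit values are hypothetical and not part of Runpod or FastAPI; a real deployment would likely keep the counters in a shared store so every HTTP server instance sees the same numbers.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class QuotaTracker:
    """Fixed-window request quota per user (illustrative, in-memory only)."""

    limit: int              # max requests allowed per window
    window_seconds: float   # length of one window
    # user_id -> (window_start, requests_used_in_window)
    _usage: Dict[str, Tuple[float, int]] = field(default_factory=dict)

    def try_consume(self, user_id: str, now: Optional[float] = None) -> bool:
        """Count one request; return False if the user is out of quota."""
        now = time.monotonic() if now is None else now
        start, count = self._usage.get(user_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired, start a fresh one
        if count >= self.limit:
            return False
        self._usage[user_id] = (start, count + 1)
        return True

    def remaining(self, user_id: str, now: Optional[float] = None) -> int:
        """Requests the user can still make in the current window."""
        now = time.monotonic() if now is None else now
        start, count = self._usage.get(user_id, (now, 0))
        if now - start >= self.window_seconds:
            return self.limit
        return max(0, self.limit - count)
```

The HTTP server would call `try_consume(user_id)` after authentication and before forwarding to the LLM endpoint, rejecting the request (for example with HTTP 429) when it returns `False`.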
10 Replies
Unknown User • 10mo ago
Message Not Public
jackson hole (OP) • 10mo ago
Damn, the visualization (the attached image, if that's what you meant, was just to grab attention 😅) got removed. The question is rather a request for guidance on the standard architecture design when deploying LLMs with authentication. If the basic setup is good enough, then okay; otherwise, please guide me further. Thanks.
Unknown User • 10mo ago
Message Not Public
jackson hole (OP) • 10mo ago
Alrighty, then I guess I should go ahead with that visualization.
Unknown User • 10mo ago
Message Not Public
jackson hole (OP) • 10mo ago
Fabulous. Thanks. ✨

One thing... Generally, the security is on our end, and we need to decide how we want to proceed with authentication. There are several options, like:
- Basic authentication (sending username and password in a header -- least secure)
- Some dynamic token (signed or hashed, e.g. with SHA, and that sort of stuff)
- An API key per user account (just like OpenAI)
- etc.

Let's say we have selected one of these techniques. Is there any predefined framework that we can use, or do we need to code this logic from scratch? I have heard of "AWS API Gateway" but I'm not sure about its relevance. For context, we are using FastAPI as our HTTP request handler, and that will send the request to Runpod.

So, the question: should we write the authentication logic ourselves, or are there libraries/services that can do this for us? Thanks mate
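For the "API key per user account" option, here is a rough sketch using only the Python standard library. The `ApiKeyStore` class, the `sk_` prefix, and the in-memory dict are assumptions made for illustration, not an existing library API. It stores only a SHA-256 digest of each key, so a leaked table does not expose usable keys; a plain digest is only a reasonable choice here because the keys are long random strings, unlike passwords.

```python
import hashlib
import secrets
from typing import Dict, Optional


class ApiKeyStore:
    """Issues and verifies per-user API keys (illustrative, in-memory only)."""

    def __init__(self) -> None:
        # SHA-256 hex digest of the key -> user_id; plaintext keys are never stored
        self._digest_to_user: Dict[str, str] = {}

    @staticmethod
    def _digest(api_key: str) -> str:
        return hashlib.sha256(api_key.encode("utf-8")).hexdigest()

    def issue_key(self, user_id: str) -> str:
        """Create a new random key; the plaintext is returned only once."""
        api_key = "sk_" + secrets.token_urlsafe(32)  # "sk_" prefix is arbitrary
        self._digest_to_user[self._digest(api_key)] = user_id
        return api_key

    def authenticate(self, presented_key: str) -> Optional[str]:
        """Return the owning user_id, or None if the key is unknown."""
        return self._digest_to_user.get(self._digest(presented_key))

    def revoke(self, api_key: str) -> None:
        """Invalidate a key, e.g. after a leak or account closure."""
        self._digest_to_user.pop(self._digest(api_key), None)
```

In a FastAPI app, `authenticate` could be called from a dependency that reads the key from a request header (FastAPI ships an `APIKeyHeader` helper in `fastapi.security` for that) and raises a 401 when it returns `None`. A managed gateway or auth service can take over the same role if you would rather not maintain this yourself.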
Unknown User • 10mo ago
Message Not Public
jackson hole (OP) • 10mo ago
I see, that will basically replace our "authentication server layer". Thanks a lot -- looking forward to implementing these soon ✌🏻
Unknown User • 10mo ago
Message Not Public
jackson hole (OP) • 10mo ago
Absolutely mate
