What is the best practice to handle my own load balancing for replicas on railway?

I'm coming here after reading the blog post: https://blog.railway.app/p/launch-week-01-horizontal-scaling

My current setup looks like this: a "front" server serves static content and sends requests to a "back" server. Each request includes a session_id, which is also tied to a user_id. I need to make sure that once a user makes their first hit and has an "active" session, their subsequent requests hit the same "back" server replica (because, well, the requests aren't exactly RESTful, as you might've guessed, and I really do need to keep it that way). Again, for various reasons, keeping all session data in some db and switching to a RESTful approach is not ideal for me in the long run, but I could see myself spinning up a db to "remember" the replica for each user id / session id.

How would you set such a thing up? I'm open to suggestions, and would also like feedback on the easiest way to "plug in" to the Railway load balancing, i.e. how I would efficiently do this:

"One important note: If you don’t like this random load balancing, you can stick your own NGINX/HAProxy middle-proxy (using our DNS to discover backends) and load balance with any algorithm you like."

I'm not quite sure about the "using our DNS to discover backends" part and the exact way I'd do it 😅 I guess with a db storing the "metadata" I could always direct a user to the correct replica, but again, I'd need pointers on how to populate the db with the correct routes to hit in, say, the internal Railway network with multiple replicas.
Railway Blog
So You Think You Can Scale?
Wanna know how Horizontal Scaling works? Well, it’s a tale in two parts: orchestration and networking. Let’s dig into the digital dirt and untangle the bits and bytes that make this scaling magic happen.
31 Replies
Brody
Brody9mo ago
whatever the outcome, it will involve you updating your code or rolling your own load balancing solution, so you might wanna ask yourself, are replicas even necessary for your use case?
alexbalandi
alexbalandi9mo ago
Yup. It is for production, for my small team's chatbot, and while we probably won't hit that many users, yes, we need the ability to scale and I gotta prepare for it. So the question for me rn is "how", not "should I?"
railway discord doesn't display this well, but I also have a pro seat on a team, this question is for that, not hobby projects 🙈
Brody
Brody9mo ago
well the answer you won't like, update your code to work with native replicas
alexbalandi
alexbalandi9mo ago
I mean, I've quoted the suggested solution, I'd just like to have more pointers for it if it's easy for someone to provide them.
Brody
Brody9mo ago
and when they are talking about using dns to discover backend replicas so that you could roll your own load balancer, they are talking about this https://utilities.up.railway.app/dns-lookup?value=hello-world.railway.internal&type=ip
you would then just use a proxy that supports grabbing dynamic upstream hosts from an A/AAAA lookup and set your own load balancing mechanism https://caddyserver.com/docs/caddyfile/directives/reverse_proxy#aaaaa
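A minimal sketch of the DNS discovery step described above, assuming the lookup runs from a service inside the same project and environment. The hostname comes from the example link; the port and the `discover_replicas` helper are hypothetical names used only for illustration. The `resolver` parameter exists only so the function can be exercised without real DNS.

```python
import socket
from typing import Callable, Iterable


def discover_replicas(
    hostname: str,
    port: int,
    resolver: Callable[..., Iterable[tuple]] = socket.getaddrinfo,
) -> list[str]:
    """Return the sorted, de-duplicated IPs that `hostname` resolves to."""
    # AF_UNSPEC asks the resolver for both A (IPv4) and AAAA (IPv6) records.
    infos = resolver(hostname, port, family=socket.AF_UNSPEC, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Hypothetical usage from inside the private network (port 8080 is an assumption):
#   discover_replicas("hello-world.railway.internal", 8080)
```

If the internal DNS answers with one record per replica, as described in this thread, the returned list is the set of backends a custom load balancer could choose from. Re-running the lookup periodically would pick up replicas that were added or removed.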
alexbalandi
alexbalandi9mo ago
oh, that looks like exactly what i need, ty
Brody
Brody9mo ago
I don't know what load balancer mechanism would work for your usecase, but I'd definitely recommend using caddy
alexbalandi
alexbalandi9mo ago
there's probably some docs on it, right?
Brody
Brody9mo ago
docs on what exactly? railways docs are generally high level only, they don't go into the nitty gritty of how stuff like dns works
alexbalandi
alexbalandi9mo ago
ye, i see that the root page has an api description, this should be enough for me https://utilities.up.railway.app/
But let's say I have an internal service
Brody
Brody9mo ago
for transparency's sake, that is my service and it is not affiliated with railway
alexbalandi
alexbalandi9mo ago
so https://utilities.up.railway.app/dns-lookup?value=personalized_assistant_core.railway.internal&type=ip should work when i do this request from a container on the internal network, right?
oh 😅 do you have the github for it? i'd look up the code if you don't mind
Brody
Brody9mo ago
that request will not work, private networks are scoped to the project and environment they are in, your service does not exist within my project so therefore you wouldn't be able to run a lookup on your private domains
nope, always have just deployed it from cli
alexbalandi
alexbalandi9mo ago
btw, it seems nginx has "sticky" sessions almost out-of-the-box https://docs.nginx.com/nginx/admin-guide/load-balancer/http-load-balancer/
HTTP Load Balancing
Load balance HTTP traffic across web or application server groups, with several algorithms and advanced features like slow-start and session persistence.
Brody
Brody9mo ago
do they support dynamically pulling services from an AAAA lookup? you'd need a proxy that supports that, or else you would be stuck running duplicate services yourself
caddy supports both sticky upstreams and pulling upstreams dynamically from an AAAA lookup
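For anyone rolling the routing themselves instead of using a proxy's built-in sticky policy, here is a rough sketch of one way to map a session to a replica without a database. It uses rendezvous (highest-random-weight) hashing, which is a swapped-in technique and not how Caddy or NGINX implement stickiness. `pick_replica` and the sample addresses are hypothetical.

```python
import hashlib


def pick_replica(session_id: str, replicas: list[str]) -> str:
    """Pick the replica with the highest hash score for this session.

    The same session_id maps to the same replica for as long as that
    replica stays in the list, regardless of list order.
    """
    if not replicas:
        raise ValueError("no replicas available")

    def score(replica: str) -> int:
        # Hash the (session, replica) pair; the first 8 bytes are the score.
        digest = hashlib.sha256(f"{session_id}|{replica}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(replicas, key=score)


# Hypothetical usage with addresses obtained from a DNS lookup:
#   pick_replica("session-abc", ["fd12::1", "fd12::2", "fd12::3"])
```

The trade-off: no lookup table is needed, but if the chosen replica goes away (a redeploy, a scale-down), the session lands on a different replica and any in-memory state on the old one is gone.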
alexbalandi
alexbalandi9mo ago
Okay, i'll look at them both, ty!
Brody
Brody9mo ago
you've piqued my interest though, now I kinda wanna roll a lb myself
alexbalandi
alexbalandi9mo ago
hehe I mean, your utility service is like half the work done tbh
Brody
Brody9mo ago
how so
the IP type lookup just does a parallel dns lookup for both A and AAAA types, nothing fancy
railway's internal dns server is the one responding with all the replicas' ips
alexbalandi
alexbalandi9mo ago
i mean with your /replica and /dns-lookup you can just implement the db as i mentioned, then it's just a matter of reading/writing from this db, which has replicas associated with users, no? like that's the straightforward solution i'm thinking of rn
Then you just send your requests with the db's help
Brody
Brody9mo ago
/replica just returns the RAILWAY_REPLICA_ID environment variable, nothing special
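A small sketch of what an endpoint like that could look like, using only the standard library. This is not the code of the utility service discussed here (its source is not public); `ReplicaHandler`, `replica_id`, and the "unknown" fallback are assumptions made for illustration.

```python
import os
from http.server import BaseHTTPRequestHandler


def replica_id() -> str:
    # RAILWAY_REPLICA_ID is the variable named in the thread;
    # the "unknown" fallback is an assumption for running outside Railway.
    return os.environ.get("RAILWAY_REPLICA_ID", "unknown")


class ReplicaHandler(BaseHTTPRequestHandler):
    """Answer GET /replica with this replica's ID as plain text."""

    def do_GET(self) -> None:
        if self.path != "/replica":
            self.send_error(404)
            return
        body = replica_id().encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To run it, pass `ReplicaHandler` to `http.server.HTTPServer` and call `serve_forever()`; which address and port to bind depends on the deployment and is left out here. Hitting the endpoint repeatedly through the default load balancer is a quick way to see which replica answered each request.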
alexbalandi
alexbalandi9mo ago
along with the request's ip, it says there
Brody
Brody9mo ago
yeah that's your ip
alexbalandi
alexbalandi9mo ago
ah I thought it captures the replica's one
well, tbh, you could just update the code for replicas to populate the db themselves, and send a cookie in the request to the replica
Brody
Brody9mo ago
sounds confusing
alexbalandi
alexbalandi9mo ago
replica receives cookie -> stores cookie as key in db, its own ip and port on the internal network as value
actually, isn't it the easiest braindead solution?
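A rough sketch of that cookie-to-replica registry, assuming a shared database that every replica and the "front" server can reach. SQLite in memory is used here only as a stand-in so the example runs on its own; the table name, function names, and address format are all hypothetical.

```python
import sqlite3
from typing import Optional


def open_registry(path: str = ":memory:") -> sqlite3.Connection:
    """Open the registry and create the routing table if it is missing."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS session_routes ("
        "cookie TEXT PRIMARY KEY, address TEXT NOT NULL)"
    )
    return db


def lookup_replica(db: sqlite3.Connection, cookie: str) -> Optional[str]:
    """Return the address that owns `cookie`, or None if nobody claimed it."""
    row = db.execute(
        "SELECT address FROM session_routes WHERE cookie = ?", (cookie,)
    ).fetchone()
    return row[0] if row else None


def register_session(db: sqlite3.Connection, cookie: str, address: str) -> str:
    """Claim `cookie` for `address` unless already claimed; return the owner."""
    with db:  # commits on success, rolls back on error
        # INSERT OR IGNORE keeps the first claim, so the first replica wins.
        db.execute(
            "INSERT OR IGNORE INTO session_routes (cookie, address) VALUES (?, ?)",
            (cookie, address),
        )
    owner = lookup_replica(db, cookie)
    if owner is None:
        raise RuntimeError("registration failed")
    return owner
```

A replica would call `register_session` with its own internal address on the first request of a session, and the "front" server would call `lookup_replica` before forwarding later requests. Entries go stale when a replica is redeployed with a new address, so a real setup would need some expiry or cleanup on top of this.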
Brody
Brody9mo ago
my solution is just write efficient code so I will never need more than one replica
alexbalandi
alexbalandi9mo ago
last words right there!
Brody
Brody9mo ago
true