Questions about Safekeeper Replacement and Self-Hosting
Hello,
I'm currently exploring Neon Self-Hosting and came across your blog post below, which was very helpful:
https://neon.com/blog/paxos#safekeeper-failure
I have a few questions:
1. In case a safekeeper node experiences an unrecoverable failure, it needs to be replaced.
When starting a compute node using compute_ctl, we provide safekeeper_connstrings during initialization.
Is there a way to update safekeeper_connstrings after the compute node has already been started?
2. The blog post mentions that automating safekeeper replacement is on the roadmap.
Couldn't this automation be achieved simply by launching a new safekeeper through the control plane (e.g. Kubernetes), or are there additional complexities involved?
Thank you very much!
Neon
Why does Neon use Paxos instead of Raft, and what's the difference?...
TLDR: Neon separates storage and compute, substituting the PostgreSQL persistency layer with a custom-made distributed storage written in Rust. Due to this separation, some nodes don’t have persistent disks. The original Raft paper works only with uniform nodes, but Paxos variants support proposers without storage and allow the acceptors to be...
other-emerald•4mo ago
Hi @levinism, if you don't get an answer here, please try the #self-hosted channel for this question.