If one safekeeper crashes without a way to recover, are humans still needed?
https://neon.tech/blog/paxos says:
"If the safekeeper is up again, it can join the cluster automatically. But if it crashes without a way to recover, we need to change cluster membership. Right now, such a change requires humans to be in the loop to ensure that the old safekeeper is actually down. It is on our roadmap to automate this procedure."
Was it done? Thanks.
4 Replies
modern-teal•6mo ago
Automatic handling of Safekeeper failure is currently being worked on. In the meantime, losing a Safekeeper has no impact on service availability or performance: writes continue to the other Safekeepers.
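As a rough illustration of why writes can continue, here is a minimal sketch of a majority-quorum check. It assumes a write is durable once a majority of the configured Safekeepers acknowledge it, which is the usual Paxos-style rule. The function names and the cluster size of 3 are hypothetical and not taken from Neon's code.

```python
def majority(total_safekeepers: int) -> int:
    """Smallest number of acknowledgements that forms a majority (assumed quorum rule)."""
    return total_safekeepers // 2 + 1


def has_write_quorum(total_safekeepers: int, alive_safekeepers: int) -> bool:
    """True if the Safekeepers that are still alive can form a majority."""
    return alive_safekeepers >= majority(total_safekeepers)


def failures_tolerated(total_safekeepers: int, alive_safekeepers: int) -> int:
    """How many more Safekeepers could fail before the quorum is lost."""
    return max(alive_safekeepers - majority(total_safekeepers), 0)


if __name__ == "__main__":
    total = 3  # hypothetical cluster size, for illustration only
    for alive in (3, 2, 1):
        print(alive, has_write_quorum(total, alive), failures_tolerated(total, alive))
```

Under these assumptions, a 3-node cluster that has lost one Safekeeper still accepts writes, but it cannot tolerate a second failure until the lost node is replaced.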
conscious-sapphireOP•6mo ago
But we still have to add a new Safekeeper back to maintain safety. Is that automated?
modern-teal•6mo ago
Adding a new Safekeeper back is not automated yet, but work is underway to remove the manual process.
conscious-sapphireOP•6mo ago