Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

it would be interesting to know how stolon/patroni deal with the failover edge cases and how this impacts availability. like if you accessing the DB but can't contact etcd/consul then you should stop accessing the DB because you might start doing unsafe writes. but this means that consul/etcd is now a point of failure (though, this usually runs multiple nodes so shouldn't happen!). but you can end up in a situation where bugs/issues with the HA system ends up causing you more downtime than manual failover would cause.

you also have to be careful with ensuring there is sufficient time gaps when failing over to cover the case when the master is not really down and connections are still writing to it. like the patroni default haproxy config doesn't even seem to kill live connections which seems kind of risky.



> if you accessing the DB but can't contact etcd/consul then you should stop accessing the DB because you might start doing unsafe writes.

If patroni can't update leader lock in Etcd, it will restart postgres in read-only mode. No writes will happen.

> like the patroni default haproxy config doesn't even seem to kill live connections which seems kind of risky.

That's not true: https://github.com/zalando/patroni/blob/master/haproxy.cfg#L...


Ah. Thanks. I was looking at the template files but I guess that is not used or used for something else.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: