
> In practice, however, this is more subtle -- what if there is a netsplit, one side maintains its primary, and the other side elects a new one? What if the user writes to both primaries on either side? What happens after the cluster rejoins? In this specific case, we solve this by requiring the majority of the replicas to acknowledge writes by default before the write acknowledgement is sent to the client.

That's the particular case I was curious about, and that solution makes a lot of sense. I suppose that would also mean that if an outage takes out more than half of your replicas, you will lose all write ability (unless I'm mistaken), but that's probably better than the potential mess that could ensue.

Do these write requests to nodes that have no route to a primary simply "hang", or are they rejected by the daemon?



> That's the particular case I was curious about, and that solution makes a lot of sense. I suppose that would also mean that if an outage takes out more than half of your replicas, you will lose all write ability

That's correct. We've built "emergency repair" provisions into the product to handle cases like this, should they happen, but they require manual intervention. (In general, if you lose more than half of your servers, you want to intervene manually anyway.)
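
For anyone following along, the majority rule can be sketched in a few lines. This is only an illustration of the arithmetic, not the product's actual code; `majority` and `can_acknowledge_write` are hypothetical names, and a 5-replica cluster split 3/2 is an assumed example.

```python
def majority(total_replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return total_replicas // 2 + 1


def can_acknowledge_write(acks: int, total_replicas: int) -> bool:
    """A write is acknowledged to the client only once a majority of replicas have acked it."""
    return acks >= majority(total_replicas)


# Assumed scenario: 5 replicas, a netsplit leaves 3 on one side and 2 on the other.
total = 5
print(can_acknowledge_write(3, total))  # True: the 3-replica side can still accept writes
print(can_acknowledge_write(2, total))  # False: the 2-replica side cannot acknowledge writes
```

Since at most one side of a split can hold a strict majority, two primaries can never both acknowledge writes. The cost is the one discussed above: once a majority is unreachable, no side can acknowledge writes until someone intervenes.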

> Do these write requests to nodes that have no route to a primary simply "hang", or are they rejected by the daemon?

They time out via normal TCP mechanisms and get rejected.


Sounds like you guys are taking a pretty reasonable approach to all this, thanks for letting me pick your brain!



