> In practice, however, this is more subtle -- what if there is a netsplit, one ...

coffeemug · on July 16, 2015

> That's the particular case that I was curious about, and that solution makes a lot of sense. I suppose that would also mean that if an outage takes out more than half of your replicas, then you will lose all write ability.

That's correct. We've built an "emergency repair" provision into the product to handle cases like this, should they happen, but it requires manual intervention. (In general, if you lose more than half of your servers, you want to intervene manually anyway.)
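
To make the majority rule concrete, here is a minimal sketch in Python. It assumes writes need a strict majority of replicas to be reachable, which is the usual reading of "more than half"; the function name `has_write_quorum` and its parameters are hypothetical and are not part of any product API.

```python
def has_write_quorum(total_replicas: int, reachable_replicas: int) -> bool:
    """Return True if a strict majority of replicas is reachable.

    Assumption for illustration: writes are accepted only while more
    than half of the configured replicas can be reached.
    """
    if total_replicas <= 0:
        raise ValueError("total_replicas must be positive")
    if not 0 <= reachable_replicas <= total_replicas:
        raise ValueError("reachable_replicas must be between 0 and total_replicas")
    # Strict majority: exactly half is not enough.
    return reachable_replicas > total_replicas // 2
```

With five replicas, losing two still leaves a majority (`has_write_quorum(5, 3)` is true), while losing three does not (`has_write_quorum(5, 2)` is false). Under this assumption an even split, such as two of four, also blocks writes, since neither side of a netsplit can claim a majority.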

> Do these write requests to nodes that have no route to a primary simply "hang", or are they rejected by the daemon?

They time out via normal TCP mechanisms and get rejected.

chrisfosterelli · on July 16, 2015

Sounds like you guys are taking a pretty reasonable approach to all this. Thanks for letting me pick your brain!