Just yesterday I was helping with an incident, and we had outlined two possible solutions. The first was bad (do nothing and leave things in a somewhat bad state); the second was a ton of work (including restoring from snapshot). I was able to describe the options as points on a spectrum:
1. Do nothing, bad state remains
2. Surgically restore to world before bad state
3. Remove all bad state (even if it wasn't caused by the incident)
I'm omitting some details, but I think this strategy can be effective.
How configurable should X be? We have one customer who wants it to be configured like this. All other customers want it configured like that.
We can make X maximally configurable, which would allow future customers to have a custom setup. How likely is that?
If it's not likely, here's an alternative...