I wouldn't call 5-level-nested code "surprisingly clean", and continuations are cursed. I wouldn't want to have to debug tests that relied on continuations unnecessarily.
The code doesn't have to be nested; it can be factored into methods. Also, these aren't technically continuations in the sense of the "continuation passing style" popular in web programming, so they don't carry the same drawbacks. Tests are essentially organised as a pipeline of functions that gets run; it's just that the pipeline has forks in it, and the data flows along both branches of each fork.