Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

WASM isn't quite deterministic. An easy example is NaN propagation, which can be nondeterministic in certain circumstances. Obelisk itself seems to allow nondeterminism via the sleep() function. Just create a race condition among a join set. I imagine that might even get easier once the TODO to implement sleep jitter is completed.

It's certainly close enough that calling it deterministic isn't misleading (though I'd stop short of "true determinism"), but there's still sharp edges here with things like hashmaps (e.g. by recompiling: https://dev.to/gnunicorn/hunting-down-a-non-determinism-bug-...).



Thanks for bringing that up. Regarding the NaN canonicalization, there is a flag for it in wasmtime [1], I should probably make sure it is turned on.

Although I don't expect to be an issue practically speaking, Obelisk checks that the replay is deterministic and fails the workflow when an unexpected event is triggered. It should be also be possible to add an automatic replay of each finished execution to verify the determinism e.g. while testing.

[1] https://docs.rs/wasmtime/latest/wasmtime/struct.Config.html#...

Edit: Enabling the flags here: https://github.com/obeli-sk/obelisk/pull/67


> Just create a race condition among a join set.

All responses and completed delays are stored in a table with an auto-incremented id, so the `-await-next` will always resolve to the same value.

As you mention, putting a persistent sleep and a child execution into the same join set is not yet implemented.


I get that, the nondeterminism would come from the completion order of the join set. If the children sleep appropriately, they'll race to be inserted after completing, and order of the result set will depend on the specifics of the implementation. It's possible this could happen deterministically, but probably not reasonably.


Sorry for the late reply.

The actual order in which child workflows finish and their results hit the persistence layer is indeed nondeterministic in real-time execution. Trying to force deterministic completion order would likely be complex and defeat the purpose of parallelism, as you noted.

However, this external nondeterminism is outside the scope of the workflow execution's determinism required for replay.

When the workflow replays, it doesn't re-run the race. It consumes events from the log. The `-await-next` operation during replay simply reads the next recorded result, based on the fixed order. Since the log provides the same sequence of results every time, the workflow's internal logic proceeds identically, making the same decisions based on that recorded history.

Determinism is maintained within the replay context by reading the persisted, ordered outcomes of those nondeterministic races.


> An easy example is NaN propagation, which can be nondeterministic in certain circumstances.

Which circumstances?


See for example

https://github.com/WebAssembly/design/issues/1463

In general, if NaN1 and NaN2 are different (there are 23 bits in a NaN that can be set to an arbitrary value, the NaN payload) then combining them isn’t deterministic. NaN1 + NaN2 might produce NaN1 or NaN2 depending on the processor model, instructions etc. IEEE754 only says that the result must be one of the two input NaNs (or at least it did last time I checked the standard.)

In practice, NaN payloads are seldomly used so it doesn’t matter much. NaN canonicalization involves transforming all NaNs to specific NaN values, and AFAIK it’s expensive because technically you need to check for NaN all the time.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: