The point I'm making is that neither Bazel nor Nix does that. However, sandboxing is still relevant, because if you have impurities leaking in from outside the closure of the build, you have bigger fish to fry than non-deterministic builds.

That all said, in practice most of the cases where Nixpkgs builds are not deterministic are fairly trivial. Even though it's not necessarily an explicit goal, compilers are more deterministic than not, and the sources of non-determinism are fewer than you'd think. Case in point: I'm pretty sure the vast majority of Nixpkgs packages that are bit-for-bit reproducible are so by accident, simply because nothing in the build is actually non-deterministic. Where builds do differ, it's often something as mundane as objects being linked in whatever order the scheduler happened to finish them in.
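To make the link-order point concrete, here's a toy Python sketch (nothing Nixpkgs-specific; the object file names are made up): if the linker gets its inputs in whatever order the compile jobs finished, the command line, and hence the binary, differs between otherwise identical builds, and sorting the list is usually the whole fix.

  import random

  objects = ["parse.o", "lex.o", "eval.o", "main.o"]   # made-up file names

  def finished_order(objs):
      # stand-in for "whatever order the compile jobs happened to finish in"
      objs = objs[:]
      random.shuffle(objs)
      return objs

  nondeterministic = "ld -o prog " + " ".join(finished_order(objects))
  deterministic    = "ld -o prog " + " ".join(sorted(objects))

  print(nondeterministic)  # varies from run to run
  print(deterministic)     # always the same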

Running everything under a deterministic VM would probably be too slow and/or cumbersome, so I think Nix is the best it's going to get.



Sandboxing is relevant, but Nix does that by default, so there's no difference here.

Nonetheless, I agree that Nix is about as good as it gets here; full-on emulation would be prohibitively expensive.


You know, though, it would probably be possible to develop a pretty fast "deterministic" execution environment if you just limit execution to a single thread, without resorting to full emulation. You'd still have to deal with differences between CPUs, but that would probably not be nearly as big of an issue. It would slow down builds, but on the other hand, you can run a lot of them in parallel. This could get pretty interesting combined with integrating build system output directly into the Nix DAG, because then you could recover some of the intra-build parallelism, too. It wouldn't be applicable to Nixpkgs since it would require ugly IFD hacks, but it might be interesting for a development setup.
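For what it's worth, the "lots of single-threaded builds in parallel" half can be approximated with plain Nix today. Something like this rough sketch (the package attributes are made up, and --cores 1 only hints via NIX_BUILD_CORES, it doesn't enforce single-threading):

  import subprocess

  # Ask Nix to build up to 8 derivations at once (-j 8), while hinting
  # each builder to stay single-threaded (--cores 1 sets NIX_BUILD_CORES=1).
  subprocess.run([
      "nix-build", "<nixpkgs>",
      "-A", "hello", "-A", "jq",   # made-up attribute names
      "--cores", "1",              # intra-derivation: one core
      "-j", "8",                   # inter-derivation: eight builds in parallel
  ], check=True)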

Perhaps it's an area worth researching.


I don't know; concurrency is still on the table, especially since the OS also has timing events.

Say a compiler uses multiple threads: even if you assign some fixed amount of fuel to each thread and mandate that after n instructions, thread-2 must run for another n, how would that work with, say, a kernel interrupt? Would that have to be fully emulated and delivered only at fixed times?
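To make the question concrete, here's a toy model of the fuel idea, with threads faked as Python generators. The round-robin itself is perfectly deterministic, but a real interrupt arriving from the kernel has no fixed slot in this schedule, which is exactly the problem:

  def worker(name, steps):
      # a "thread" faked as a generator; each yield is one unit of fuel
      for i in range(steps):
          yield f"{name} step {i}"

  def run_deterministic(threads, fuel=2):
      trace = []
      while threads:
          for t in list(threads):
              for _ in range(fuel):   # each thread gets exactly n steps per turn
                  try:
                      trace.append(next(t))
                  except StopIteration:
                      threads.remove(t)
                      break
      return trace

  threads = [worker("thread-1", 3), worker("thread-2", 3)]
  print(run_deterministic(threads))   # identical trace on every run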

But I do like the idea of running multiple builds in parallel so you don't take as big a hit from single-threaded builds, though that would only improve throughput, not latency.



