Bun disables post-install scripts by default, and one can explicitly opt in to trusting specific dependencies in the package.json file. One can also delay installing newly published dependency versions through keys like `minimumReleaseAge`. Bun is a drop-in replacement for the npm CLI and, unlike pnpm, has goals beyond performance and storage efficiency.
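For illustration, here is a minimal sketch of that opt-in. The project name, the package `some-native-addon`, and its version are hypothetical placeholders. The only real key is `trustedDependencies`, which Bun reads from package.json to decide which packages may run their lifecycle scripts.

```json
{
  "name": "example-app",
  "dependencies": {
    "some-native-addon": "^1.0.0"
  },
  "trustedDependencies": ["some-native-addon"]
}
```

With this in place, `bun install` should run install scripts for the listed package only. Scripts from any other dependency stay blocked.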
A bit of background. This is directed towards Spectral Compute (Michael) and https://scale-lang.com/. I know both of these guys personally and consider them both good friends, so you have to understand a bit of the background in order to really dive into this.
My take on it is fairly well summed up at the bottom of Elio's post. In essence, Elio is taking the view of "we would never use scale-lang for LLMs because we have a product that is native AMD" and Michael is taking the view of "there is a ton of CUDA code out there that isn't just AI and we can help move those people over to AMD... oh and by the way, we actually do know what we are doing, and we think we have a good chance at making this perform."
At the end of the day, both companies (my friends) are trying to make AMD a viable solution in a world dominated by an ever growing monopoly. Stepping back a bit and looking at the larger picture, I feel this is fantastic and want to support both of them in their efforts.
Just to clarify: this post was not written against Spectral Compute.
Their recent investment news was the trigger for us to finally write it, yes, but the idea has been on our minds for a long time.
We actually think solutions like theirs are good for the ecosystem: they make it easier for people to at least try AMD without throwing away their CUDA code.
Our point is simply this: if you want top-end performance (big LLMs, specific floating point support, serious throughput/latency), translation alone is not enough. At that point you have to focus on hardware-specific tuning: CDNA kernel shapes, MFMA GEMMs, ROCm-specific attention/TP, KV-cache, etc.
That’s the layer we work on: we don’t replace people’s engines, we just push the AMD hardware as hard as it can go.
This is normal. An inference engine needs support for a model's particular implementation of the transformer architecture. This has been true for almost every model release since we got local weights.
Really good model providers send a launch-day patch to llama.cpp and vLLM to make sure people can run their model instantly.
It isn't about normal or not. It is that those patches are done for Nvidia, but not AMD. It is that it takes time and energy to vet them and merge them into those projects. Kimi has been out for 3 months now and it still doesn't run out of the box on vLLM on AMD, but it works just fine with Nvidia.
I don't even think about it. If available, I take it from near where I park and I return it to the front of the store with the rest of the carts. The little tiny bit of extra exercise is nice to clear my head before I start driving.
I think the first step is to develop a "we're not Texas" culture. Observe the ways in which Texas is ruining its environment and deliberately, conspicuously do something else.
For example, the aquifer situation in the Central Valley of California is in some ways similar to that of the Ogallala Aquifer in Texas. "If we don't want to end up like Texas, we need to get a handle on this." Enact laws and conservation measures which make it difficult for those coming from out of state to bring their ecologically irresponsible practices with them. Ideally, reduce the ecological impact wrought by well-established California interests as well, but if necessary grandfather them in, in order to prepare.
https://x.com/SpectralCom/status/1993289178130661838