latchkey's comments | Hacker News

Spectral Compute just did a thread on that.

https://x.com/SpectralCom/status/1993289178130661838


> CUDA is that level of abstraction, but it only really works for Nvidia cards.

There are people actively working on that.

https://scale-lang.com/


ProTip: `use bun`

Funny that this is getting downvoted, but it installs dependencies super fast, has the same approval feature as pnpm, and is all in a single binary.


This is like saying "use MacOS and you won't get viruses" in the 2000s


Bun disables post-install scripts by default, and one can explicitly opt in to trusting dependencies in the package.json file. One can also delay installing updated dependencies through keys like `minimumReleaseAge`. Bun is a drop-in replacement for the npm CLI and, unlike pnpm, has goals beyond performance and storage efficiency.
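
A minimal sketch of both knobs (assuming current Bun behavior; `trustedDependencies` is the package.json allowlist for lifecycle scripts, the bunfig.toml key is in seconds, and the package names here are just examples):

    // package.json: only these packages may run lifecycle scripts
    {
      "trustedDependencies": ["esbuild", "sharp"]
    }

    # bunfig.toml: skip versions published less than 3 days ago
    [install]
    minimumReleaseAge = 259200

Everything else installs with its scripts disabled by default.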

Not sure what your analogy is trying to imply.


Which was for the most part true.


The suggestion was to use pnpm, and I'm suggesting something I prefer over pnpm.


Except trying it out takes a minute and costs nothing.


"Rewrite it in rust"


It was like that last year too.


A bit of background: this is directed towards Spectral Compute (Michael) and https://scale-lang.com/. I know both of these guys personally and consider them good friends, so some context helps before really diving into this.

My take on it is fairly well summed up at the bottom of Elio's post. In essence, Elio is taking the view of "we would never use scale-lang for LLMs because we have a product that is native AMD" and Michael is taking the view of "there is a ton of CUDA code out there that isn't just AI, and we can help move those people over to AMD... oh, and by the way, we actually do know what we are doing, and we think we have a good chance at making this perform."

At the end of the day, both companies (my friends) are trying to make AMD a viable solution in a world dominated by an ever-growing monopoly. Stepping back a bit and looking at the larger picture, I feel this is fantastic and want to support both of them in their efforts.


Just to clarify: this post was not written against Spectral Compute. Their recent investment news was the trigger for us to finally write it, yes, but the idea has been on our minds for a long time.

We actually think solutions like theirs are good for the ecosystem: they make it easier for people to at least try AMD without throwing away their CUDA code.

Our point is simply this: if you want top-end performance (big LLMs, specific floating-point support, serious throughput/latency), translation alone is not enough. At that point you have to focus on hardware-specific tuning: CDNA kernel shapes, MFMA GEMMs, ROCm-specific attention/TP, KV-cache, etc.

That’s the layer we work on: we don’t replace people’s engines, we just push the AMD hardware as hard as it can go.
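
To make that concrete: a single MFMA tile on CDNA looks roughly like this (a minimal illustrative sketch following the lane layout in AMD's matrix-core examples for v_mfma_f32_16x16x16f16, not production code):

    // One 64-lane wavefront, launched as a 16x4 block, computes
    // D = A*B where A and B are 16x16 f16 and D is a 16x16 f32 tile.
    #include <hip/hip_runtime.h>

    using f16x4 = __attribute__((__vector_size__(4 * sizeof(__fp16)))) __fp16;
    using f32x4 = __attribute__((__vector_size__(4 * sizeof(float)))) float;

    __global__ void mfma_tile_16x16x16(const __fp16* A, const __fp16* B,
                                       float* D) {
      f16x4 a, b;
      f32x4 d = {0};
      for (int i = 0; i < 4; ++i) {
        int k = threadIdx.y * 4 + i;      // the four K slices this lane owns
        a[i] = A[threadIdx.x * 16 + k];   // A[row][k], row-major
        b[i] = B[k * 16 + threadIdx.x];   // B[k][col], row-major
      }
      d = __builtin_amdgcn_mfma_f32_16x16x16f16(a, b, d, 0, 0, 0);
      for (int i = 0; i < 4; ++i)
        D[(threadIdx.y * 4 + i) * 16 + threadIdx.x] = d[i];  // D[row][col]
    }

Going from one correct tile to a competitive GEMM (LDS staging, tile shapes per CDNA generation, software pipelining) is exactly the tuning layer that translation alone doesn't give you.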


Kimi is the latest model that isn't running correctly on AMD. It is apparently close to DeepSeek in design, but different enough that it just doesn't work.

It isn't just the model; it's also the engine that runs it. From what I understand, this model works with SGLang but not with vLLM.


This is normal. An inference engine needs support for a model's particular implementation of the transformer architecture. This has been true for almost every model release since we got local weights.

Really good model providers send a launch-day patch to llama.cpp and vLLM to make sure people can run their model instantly.


It isn't about whether it's normal. It is that those patches are done for Nvidia, but not for AMD, and that it takes time and energy to vet them and merge them into those projects. Kimi has been out for 3 months now and it still doesn't run out of the box on vLLM on AMD, but it works just fine with Nvidia.


I figured out how to PXE boot 20,000 PS5 APU blades (BC-250) during COVID, when I couldn't even get to the actual hardware. Great fun.
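
The heart of it is just DHCP + TFTP + iPXE. Something along these lines (a heavily stripped-down dnsmasq sketch; the interface, addresses, and paths are made up):

    # dnsmasq as the DHCP + TFTP server for the boot network
    interface=eth0
    dhcp-range=10.0.0.100,10.0.0.250,12h
    dhcp-boot=ipxe.efi       # UEFI clients chain-load iPXE over TFTP
    enable-tftp
    tftp-root=/srv/tftp

From there, iPXE can pull the kernel and initrd over HTTP, which holds up much better than TFTP when thousands of nodes boot at once.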


I don't even think about it. If available, I take it from near where I park and I return it to the front of the store with the rest of the carts. The little tiny bit of extra exercise is nice to clear my head before I start driving.


When it becomes a habit, good deeds become effortless. I wish I could say that about most people.


How would you prepare?


I think the first step is to develop a "we're not Texas" culture. Observe the ways in which Texas is ruining its environment and deliberately, conspicuously do something else.

For example, the aquifer situation in the Central Valley of California is in some ways similar to the Ogallala aquifer in Texas. "If we don't want to end up like Texas, we need to get a handle on this." Enact laws and conservation measures that make it difficult for those coming from out of state to bring their ecologically irresponsible practices with them. Ideally, reduce the ecological impact wrought by well-established California interests as well, but if necessary grandfather them in, in order to prepare.


In SE Asia, tankless heaters are generally placed close to the point of use.

A recirculation pump means that you're paying to keep your pipes from getting cold.

