
Yes, but if you can run a sufficiently capable LLM on a $2,000 laptop, then the cost to serve it from the cloud will be similarly cheap (e.g., reserve an appropriately sized EC2 instance for pennies on the dollar).

It's a highly competitive market. Companies aren't going to pay $100k/year for cloud inference of a model that runs on a $2k consumer-grade device.
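To make the comparison concrete, here's a back-of-envelope sketch with hypothetical numbers (the $100k cloud spend from the comment above, and an assumed 3-year device lifetime):

```python
# Back-of-envelope cost comparison: cloud inference vs. a local consumer device.
# All figures are illustrative assumptions, not real pricing.

cloud_cost_per_year = 100_000      # claimed enterprise cloud spend ($/year)
device_cost = 2_000                # consumer-grade laptop ($)
device_lifetime_years = 3          # assumed amortization period

device_cost_per_year = device_cost / device_lifetime_years
ratio = cloud_cost_per_year / device_cost_per_year

print(round(device_cost_per_year))  # prints 667
print(round(ratio))                 # prints 150
```

Even with generous padding for electricity and support, a two-orders-of-magnitude gap is hard for cloud pricing to ignore, which is the commenter's point: competition should push serving costs toward the hardware floor.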

128 GB of GPU-accessible, high-bandwidth RAM can be had for $5,000 on a MacBook Pro today. What will that cost 3-4 years from now on Linux/Windows machines?

And we haven't yet seen any SoC vendors try to optimize for RAM capacity over compute.
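Why 128 GB matters: weight memory scales roughly linearly with parameter count times bytes per parameter. A rough sketch (the 20% overhead factor for KV cache and runtime buffers is an assumption, not a fixed rule):

```python
# Rough arithmetic for how large a model fits in a given amount of
# GPU-accessible RAM. Illustrative only; real usage depends on context
# length, runtime, and quantization scheme.

def model_ram_gb(params_billions: float, bytes_per_param: float,
                 overhead: float = 1.2) -> float:
    """Approximate RAM (GB) to hold the weights, with ~20% headroom
    for KV cache and runtime buffers (assumed)."""
    return params_billions * bytes_per_param * overhead

# A 70B-parameter model at 4-bit quantization (~0.5 bytes/param):
print(model_ram_gb(70, 0.5))   # prints 42.0 -> fits easily in 128 GB

# The same model at fp16 (2 bytes/param):
print(model_ram_gb(70, 2.0))   # prints 168.0 -> does not fit
```

This is why capacity (and memory bandwidth) rather than raw compute is the binding constraint for local inference, and why an SoC tuned for RAM capacity would be interesting.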



Oh yes, I could definitely see the privacy-preserving consumer use case creating sufficient demand for efficiency that also bleeds over into the enterprise market.

That's what's happened with power efficiency and ARM CPUs, after all!


Not sure what you mean: https://aws.amazon.com/ec2/graviton/

Not to mention the managed cloud services that run on ARM under the hood.

Of course, ARM isn't inherently cheaper; AMD and Intel could cut prices/margins significantly and probably be competitive on $/perf.


That's what I mean: ARM was initially attractive in the low-power/low-performance market, then it scaled up to higher and higher power cores (while still staying power efficient), which in turn attracted datacenter customers.


This is where I want highly sensitive healthcare users of LLMs to end up: note summarization, suggested diagnoses (with the provider always in control), and other augmented capabilities for clinical staff, without the risk of healthcare data being sent outside the device or the very local network.



