
Interesting. But it would depend on how much of model X is salvaged in creating model X+1.

I suspect that the answer is almost all of the training data, and none of the weights (because the new model has a different architecture, rather than some new pieces bolted on to the existing architecture).
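For concreteness, here's a minimal PyTorch sketch of what "salvaging" weights means mechanically (toy architectures, not any real model): you keep whatever tensors still match the new model by name and shape, and once the architecture changes, that's almost nothing.

    import torch.nn as nn

    # "Model X": the old architecture (toy sizes, purely illustrative)
    old = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512))

    # "Model X+1": a wider redesign, so most tensor shapes no longer line up
    new = nn.Sequential(nn.Linear(512, 4096), nn.ReLU(), nn.Linear(4096, 512))

    old_sd, new_sd = old.state_dict(), new.state_dict()

    # Keep only parameters that still match by both name and shape;
    # load_state_dict(strict=False) tolerates the rest being missing
    salvageable = {k: v for k, v in old_sd.items()
                   if k in new_sd and v.shape == new_sd[k].shape}
    new.load_state_dict(salvageable, strict=False)
    print(f"salvaged {len(salvageable)} of {len(new_sd)} tensors")

In this toy only the final bias survives (1 of 4 tensors); with a genuine architecture change it's effectively zero, which is why the weights get retrained from scratch while the data carries over.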

So then the question becomes, what is the relative cost of the training data vs. actually training to derive the weights? I don't know the answer to that; can anyone give a definitive answer?
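The compute side at least has a standard estimate: training FLOPs for a dense transformer are roughly 6 × parameters × tokens (Kaplan et al., 2020). A back-of-envelope sketch, where every concrete number is an assumption chosen for illustration, not any lab's actual figure:

    # All inputs below are assumptions for illustration only
    params = 70e9         # assumed model size (parameters)
    tokens = 1.4e12       # assumed training tokens
    train_flops = 6 * params * tokens        # ~5.9e23 FLOPs

    gpu_flops = 5e14      # assumed sustained FLOP/s per GPU, utilization included
    gpu_hour_cost = 2.0   # assumed $/GPU-hour
    gpu_hours = train_flops / gpu_flops / 3600
    print(f"~{gpu_hours:,.0f} GPU-hours, ~${gpu_hours * gpu_hour_cost / 1e6:.1f}M")

Under those assumptions that's on the order of 330k GPU-hours, well under $1M in raw compute. The data side (licensing, curation, human feedback) has no comparable formula, which is part of why the question is hard to answer definitively.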



There are some transferable assets, but the challenge is that the commoditization of everything means others have easy access to "good enough" assets to build upon. There's very little moat to build in this business, and that makes all the money dumped into it look a bit frothy and ready to implode.

GPT-5 is a bellwether there. OpenAI had a huge head start and access to basically whatever money and resources they needed, and after a ton of hype they released a pile of underwhelming meh. With the pace of advances slowing rapidly, the pressure will be on to make money from what's there now (which is well short of what the hype had promised).

In the language of Gartner’s hype curve, we’re about to rapidly fall into the “trough of disillusionment.”



