
No, they aren’t publishing all their secret sauce. For example, we have no idea how their baseline model was trained. They’ve said nothing about the data or code used for that training. They have described some of the optimization techniques used to arrive at the final models they released weights for, but their cost claims seem suspicious because we don’t know what prior work they built on. I’ve seen many people share evidence that DeepSeek’s models seem to think they are OpenAI models, which supports the theory that DeepSeek first built a baseline trained on the outputs of other models. DeepSeek also likely has far more GPUs than it has admitted, perhaps to avoid drawing attention to suppliers who may have violated sanctions.


The number of GPUs they have (which may well be export-legal H800s, as Nvidia believes they are) goes hand in hand with the amount it cost to train (however you define that), and is something people trying to replicate their approach can verify (or not).
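To make the "goes hand in hand" point concrete, here is a rough back-of-the-envelope sketch. The function names and every number in it are hypothetical placeholders, not DeepSeek's figures; it only shows that a claimed GPU-hour budget, a rental rate, and a wall-clock duration pin down both the cost and the cluster size, which is what a replication attempt could sanity-check.

```python
def training_cost_usd(total_gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Rental-equivalent cost of a training run (ignores R&D, failed runs, staff)."""
    return total_gpu_hours * usd_per_gpu_hour


def gpus_needed(total_gpu_hours: float, wall_clock_days: float) -> float:
    """Average number of GPUs that must run in parallel to finish in the given time."""
    return total_gpu_hours / (wall_clock_days * 24)


# Hypothetical inputs, chosen only for illustration.
claimed_gpu_hours = 1_000_000
rate_usd_per_hour = 2.0
duration_days = 30

cost = training_cost_usd(claimed_gpu_hours, rate_usd_per_hour)
cluster_size = gpus_needed(claimed_gpu_hours, duration_days)
print(f"cost: ${cost:,.0f}, GPUs needed: {cluster_size:,.0f}")
```

If someone reproduces the method and needs far more GPU-hours than claimed, the cost figure falls apart along with the GPU count, and vice versa.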

It seems obvious that you need a model trained, or fine-tuned, on some reasoning data (with backtracking etc.) so that reasoning behavior is part of its repertoire before you can use RL to, hopefully, get it to apply that reasoning toward whatever goals you are setting. I'd not be surprised if they used o1 outputs to bootstrap the model in this way, although o1's reasoning traces are a deliberate obfuscation of what it is really doing (an after-the-fact summary), so even if this is the case that should be borne in mind!

OTOH, while reasoning data may be scarce in the wild, it's presumably not entirely unavailable, and/or DeepSeek may have created some themselves, so who knows what mix DeepSeek used for this initial bootstrapping stage. As you say, this aspect remains "secret sauce".

Of course, once they've got their first-stage model trained, they then use it to generate data for the second/final stage.
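As a toy illustration of that last step, a first-stage model generating data for the next stage might look like the sketch below. This is only a guess at the general shape (sample, filter with a checker, keep what passes), not DeepSeek's actual pipeline; `collect_filtered_samples`, `toy_generate`, and `toy_is_acceptable` are hypothetical stand-ins, with a noisy adder playing the role of the model.

```python
import random
from typing import Callable, List, Tuple

Prompt = Tuple[int, int]


def collect_filtered_samples(
    prompts: List[Prompt],
    generate: Callable[[Prompt], int],
    is_acceptable: Callable[[Prompt, int], bool],
    samples_per_prompt: int = 4,
) -> List[Tuple[Prompt, int]]:
    """Sample from the first-stage model and keep only outputs that pass the checker."""
    dataset = []
    for prompt in prompts:
        for _ in range(samples_per_prompt):
            output = generate(prompt)
            if is_acceptable(prompt, output):
                dataset.append((prompt, output))
                break  # keep at most one accepted sample per prompt
    return dataset


# Hypothetical stand-in for a first-stage model: an adder that is sometimes wrong.
rng = random.Random(0)


def toy_generate(prompt: Prompt) -> int:
    a, b = prompt
    return a + b + rng.choice([0, 0, 1])


# Hypothetical stand-in for a verifier or reward check.
def toy_is_acceptable(prompt: Prompt, output: int) -> bool:
    a, b = prompt
    return output == a + b


prompts = [(1, 2), (3, 4), (10, 5)]
dataset = collect_filtered_samples(prompts, toy_generate, toy_is_acceptable)
print(dataset)
```

The filtered pairs would then be the training data for the second stage; what the real checker and sampling budget look like is exactly the part we can't see from outside.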



