
But... they're using a pretty big neural net themselves (NNUE), as far as I can tell, with datasets of hundreds of gigs. Doesn't this remove the significance of the matchup, which was supposed to be about deep learning vs. more traditional chess bot methods?

Claiming that something that uses a neural net trained on hundreds of gigs of data isn't deep learning... I mean, it's possible, I don't know the details.

What is it about now, open vs. closed source? Different methods of deep learning and big data fighting? (Both of these are also interesting, ofc.)



NNUE is much smaller than Leela's net, and has a very different architecture, optimized to run on CPUs. Additionally, Leela uses Monte Carlo Tree Search while Stockfish uses alpha-beta pruning.
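To make the contrast concrete, here is a minimal sketch of plain alpha-beta pruning over a toy game tree. This is not Stockfish's actual search (which layers many refinements on top), and the `evaluate`/`children` callbacks are hypothetical stand-ins for a real evaluation function and move generator:

```python
# Minimal alpha-beta pruning sketch. Leaves are scored by `evaluate`;
# `children` generates successor positions. Both are caller-supplied.
def alphabeta(node, depth, alpha, beta, maximizing, evaluate, children):
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        value = float("-inf")
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, evaluate, children))
            alpha = max(alpha, value)
            if alpha >= beta:   # beta cutoff: opponent avoids this line
                break
        return value
    else:
        value = float("inf")
        for child in kids:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, evaluate, children))
            beta = min(beta, value)
            if beta <= alpha:   # alpha cutoff
                break
        return value

# Tiny tree as nested lists, leaves are static scores:
children = lambda n: n if isinstance(n, list) else []
evaluate = lambda n: n
best = alphabeta([[3, 5], [2, 9]], 2, float("-inf"), float("inf"),
                 True, evaluate, children)
print(best)  # -> 3 (max over min(3,5)=3 and min(2,9)=2)
```

The cutoffs are the whole point: branches that provably can't affect the result are never evaluated, which is why a fast, shallow evaluation function pairs so well with this kind of search.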


They say that the dataset is hundreds of gigs' worth of games, so the net must still be pretty big.

Though definitely not directly comparable, the dataset for GPT-2 XL is 8 million web pages. What I mean to say is that this is clearly deep learning.


> They say that the dataset is hundreds of gigs' worth of games, so the net must still be pretty big.

This isn't true. The size of the training data doesn't imply anything about the size of the neural network.

In the case of Stockfish, the NN is quite shallow, and implemented using a custom framework designed to run fast on CPUs.

See https://news.ycombinator.com/item?id=26746160 for previous commentary on this.

> Though definitely not directly comparable, the dataset for GPT-2 XL is 8 million web pages.

This is irrelevant. You can train GPT-3 on a smaller dataset, or a smaller model on the same dataset as GPT-3.

> What I mean to say is that this is clearly deep learning.

It's been clear that neural network models are superior since AlphaGo. There's no "deep learning vs. <something else>" anymore, because the <something else> isn't competitive and no one is really working on it.


It's actually really small, mostly because bigger networks take longer to evaluate, which slows down the search, makes it shallower, and ultimately yields a weaker engine.


Are you involved in the project? Can I ask what your source is? Great if that's the case.


NNUE is a 4-layer (1 input + 3 dense) integer-only neural network.

It's just over 82,000 parameters.[1] That's a very shallow, small NN. By comparison, something like EfficientNet-B1[2] has 7.8M parameters, and that's considered a small network.

[1] https://www.chessprogramming.org/Stockfish_NNUE#NNUE_Structu...

[2] https://proceedings.mlr.press/v97/tan19a/tan19a.pdf
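For a sense of what "integer-only" means in practice, here's a toy forward pass in that spirit: small dense layers, integer arithmetic, and a clipped-ReLU activation so values fit in int8. The layer sizes, weights, and scaling shift below are made up for illustration; this is not NNUE's actual layout:

```python
import numpy as np

rng = np.random.default_rng(0)

def clipped_relu(x, limit=127):
    # NNUE-style activation: clamp to [0, limit] so activations fit in int8.
    return np.clip(x, 0, limit)

def dense_int(x, w, b, shift=6):
    # Integer matmul accumulated in int32, then rescaled by a right-shift
    # instead of a floating-point division.
    return (x.astype(np.int32) @ w.astype(np.int32) + b) >> shift

# Random int8 weights for a hypothetical 16 -> 8 -> 1 network.
w1 = rng.integers(-64, 64, size=(16, 8), dtype=np.int8)
b1 = rng.integers(-64, 64, size=8, dtype=np.int32)
w2 = rng.integers(-64, 64, size=(8, 1), dtype=np.int8)
b2 = rng.integers(-64, 64, size=1, dtype=np.int32)

def forward(features):
    h = clipped_relu(dense_int(features, w1, b1))
    return dense_int(h, w2, b2)  # final score stays an integer

score = forward(np.ones(16, dtype=np.int8))
print(int(score[0]))
```

Keeping everything in integers is what lets this kind of net be evaluated millions of times per second on a CPU, with SIMD instructions doing the matmuls.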


I am involved in lc0 development and fairly aware of SF dev. NNUE is a very small (3-layer dense) CPU-only net.


The size of the training set isn't enough to make it deep learning, right? Doesn't deep learning imply at least one hidden layer?


Are you saying you read that it didn't have a hidden layer?

My point is that having such a huge dataset would not be extremely useful without using a deep neural net (with at least one hidden layer).


NNs without at least one hidden layer are rarely used.


They're used all the time, we just call it logistic regression.
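That equivalence is easy to show: a net with zero hidden layers and a sigmoid output is exactly logistic regression, a weighted sum squashed to (0, 1). The weights below are made up for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, weights, bias):
    # One "layer": weighted sum of inputs plus a bias, then sigmoid.
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

p = predict([1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(p)  # z = 0.5*1 - 0.25*2 = 0, so p = sigmoid(0) = 0.5
```

Add a hidden layer with a nonlinearity between input and output and it stops being logistic regression; that's the line the grandparent comment is drawing.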


You can have a relatively small model and still benefit from using a gigantic training set to train the model.


You can find out more about Stockfish's NNUE architecture and inner workings here:

https://github.com/glinscott/nnue-pytorch/blob/master/docs/n...

It's a pretty interesting read.



