
"Deep" is such a good prefix for all sorts of Deep Learning, Deep Math, Deep Thinking, Deep Engineering etc. Wonder if the networks were originally called Thick neural networks, would the ML/AI revolution as we know it still happened?


Whilst I agree with the general sentiment, in this particular instance it has to do with the depth of networks that could be trained efficiently thanks to hardware advances. LeNet was 7 layers deep, Dan's 9, VGG's 13, GoogLeNet's 22, etc.

There is theory w.r.t. thick networks as well (e.g. the link to Gaussian processes requires infinite width).

Deep makes sense here.


Well, except that most neural networks are not deep: they have a very low number of layers, but each layer can be tremendously wide. This should have been called wide learning. But we could imagine some learning algorithm that exploits depth more than width. A more correct naming would take into account both dimensions, depth and width (a rough parameter-count sketch follows below).

Note that this is orthogonal to sparsity vs. density.
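
To make the two dimensions concrete, here is a minimal back-of-the-envelope sketch in Python. The layer sizes are made up for illustration; the point is only that roughly the same parameter budget can be spent on one very wide layer or on many narrower ones.

    def mlp_params(d_in, width, depth, d_out=1):
        """Approximate weight count of a fully connected net (biases ignored)."""
        sizes = [d_in] + [width] * depth + [d_out]
        return sum(a * b for a, b in zip(sizes[:-1], sizes[1:]))

    # Hypothetical sizes: shallow-and-wide vs. deep-and-narrow
    print(mlp_params(d_in=784, width=4096, depth=1))   # ~3.2M parameters
    print(mlp_params(d_in=784, width=512,  depth=12))  # ~3.3M parameters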


The depth seems to matter more than the width, at least as long as the layers are sufficiently wide. In fact, in the limit that the layer becomes infinitely wide, you just end up with a Gaussian process. In practice a width of ~100--1000 is sufficient to get behavior that is pretty close to a Gaussian process, so in general doubling the width of a layer doesn't gain you all that much compared to using those parameters for an extra layer. The real representational power seems to come from increasing depth.
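
To make the width claim concrete, here is a toy numpy sketch (my own illustration, not from the thread): over random initializations, the output of a one-hidden-layer ReLU net at a fixed input settles into a fixed Gaussian distribution as the width grows, which is the finite-width shadow of the Gaussian-process limit described above. The 1/sqrt(width) output scaling is an assumption chosen so that the limit is non-degenerate.

    import numpy as np

    rng = np.random.default_rng(0)

    def random_net_output(x, width):
        """One-hidden-layer ReLU net with 1/sqrt(fan-in) weight scaling."""
        d = x.shape[0]
        w1 = rng.normal(size=(width, d)) / np.sqrt(d)   # input -> hidden
        w2 = rng.normal(size=width) / np.sqrt(width)    # hidden -> scalar output
        return w2 @ np.maximum(w1 @ x, 0.0)

    x = rng.normal(size=16)  # one fixed input
    for width in (4, 64, 1024):
        outs = np.array([random_net_output(x, width) for _ in range(5000)])
        # As width grows, the mean and std stabilize and the histogram of
        # `outs` looks increasingly Gaussian (the GP prior at this input).
        print(width, round(outs.mean(), 3), round(outs.std(), 3))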


Around the time the phrase "deep learning" came into vogue, the advances were indeed in training deeper networks, not wider. Later on it turned out that shallow wide networks are sufficient for many problems. (Also, it turned out the pre-training tricks that people came up with for training deep networks weren't really necessary either.)


It's also important to note that they work despite being wide; you can see that in the effectiveness of pruning, and in ideas such as the lottery ticket hypothesis, which states that "successful" sub-networks within the wide network account for most of the performance (see the pruning sketch below).

In the theory literature, if you have a K-deep network, K=1 is the shallow case and K>1 is deep. Agreed, the naming could be better, but it's not like "deep work" or "deep thoughts", as the parent was suggesting.
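
For context, here is a bare-bones numpy sketch of the magnitude-pruning step such experiments are built on. The weight matrix is a random stand-in rather than a trained layer, and this is not the full lottery-ticket procedure (which also involves rewinding to early weights and retraining); it only shows the masking operation itself.

    import numpy as np

    def magnitude_prune(w, sparsity):
        """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
        threshold = np.quantile(np.abs(w), sparsity)
        return np.where(np.abs(w) >= threshold, w, 0.0)

    rng = np.random.default_rng(0)
    w = rng.normal(size=(512, 512))    # stand-in for a trained layer's weights

    w_sparse = magnitude_prune(w, 0.90)
    print("fraction of weights kept:", (w_sparse != 0).mean())   # ~0.10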


The adjective "deep" came from deep belief networks, which are a variation on restricted boltzmann machines. RBNs have one visible and one hidden layers, DNBs have more hidden layers - hence "deep". So it's not exactly based on a distinction between "deep" and "shallow" models.


I dunno, in the ResNet age, many and perhaps most networks are 20+ layers deep. I feel like the shallowest networks I see these days are RNNs being used for fast on-device ML, which tend not to be terribly wide due to the same hardware constraints.


Of all the adjectives available (deep, dense, thick, big, condensed, etc.), "deep" is definitely the one that brings the most hype - it makes it sound very advanced, and more marketable.

Deep Learning with Big Data

That stuff sells itself.


If thick neural networks did indeed exist, they would probably be a special case of something Schmidhuber invented 30 years ago.


> If thick neural networks did indeed exist, they would probably be a special case of something Schmidhuber invented 30 years ago.

Heh. A burn both pointed and subtle. A+.


it's the new e-


> ....Thick neural networks, would the ML/AI revolution as we know it still have happened?

Cancel culture will ensure that the conference doesn't happen, in the name of fat-shaming.


Finger on the pulse, my man - people HATE being called thick these days.



