There’s a difference between raw numbers on paper and actual real-world performance when training frontier models.
There’s a reason no frontier lab is using AMD GPUs for training: raw benchmarks of a single chip’s performance on a single operation type don’t translate to performance during an actual full training run.
Meta, in particular, is heavily using AMD GPUs for inference.
Also, anyone running very large models tends to prefer AMD, because the chips have 288GB of memory each and outperform on very large models.
Outside of these use cases, it’s a toss-up.
AMD is also much more aligned with the supercomputing (HPC) world, where it is dominant: AMD CPUs and GPUs power around 140 of the top 500 HPC systems and 8 of the top 10 most energy-efficient ones.
VS Code stuff is OK too if you dislike big, featureful IDEs, but it’s behind Rider.
The dotnet CLI for running commands is sufficient until you have very strange builds.
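For day-to-day work a handful of commands covers most of it (the project and package names here are just placeholders, not anything you need specifically):

    dotnet new console -n MyApp          # scaffold a console project
    dotnet build                         # compile
    dotnet run                           # build and run
    dotnet test                          # run the test suite
    dotnet add package Newtonsoft.Json   # add a NuGet dependency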
Keep in mind the language has many ways to do the same thing, so Rider helps nudge you toward the “modern” way (see the sketch below). The base class library is also very vast. Take your time; C# is great, but it has a ton of features.
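To give a rough idea of what “modern” means here, a small sketch (the type names are made up; the record needs C# 9+ and the top-level statements assume a recent .NET template with implicit usings):

    // Program.cs — top-level statements, no Main boilerplate
    var classic = new PointClassic(1, 2);
    Point modern = new(1, 2);   // target-typed new
    Console.WriteLine($"{classic} vs {modern}");

    // Classic: hand-written constructor and properties
    public class PointClassic
    {
        public int X { get; }
        public int Y { get; }
        public PointClassic(int x, int y) { X = x; Y = y; }
        public override string ToString() => $"({X}, {Y})";
    }

    // Modern: a positional record generates equivalent members,
    // plus value equality and a readable ToString, for free
    public record Point(int X, int Y);

Rider will typically suggest the record form when it spots the boilerplate version.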
I think it's somewhat of a stretch to say our "base model" had billions of years of evolution. Billions of years ago, mammals didn't exist and the only things around were more like plankton or algae and had nothing like a "base model" we could say we somehow inherited. The earliest ape-like creatures appeared around 10M years ago.
The first mammals appeared around 225M years ago, so you could potentially argue that our "base model" first started evolving around then, but I still think it's something of a stretch to compare that kind of "training" to the way we train modern neural networks. The "base model" at that time was simply: survive, eat, reproduce, plus enough circuitry to manage basic biological functions.
We are essentially running the entire volume of human knowledge through a neural network over billions of iterations, and the model itself has 175-billion-plus parameters. Neither humans nor any of our evolutionary ancestors ever received this kind of "training"; it's simply not comparable at all. Our mammalian ancestors were exposed to "basic" natural environments; they were not "pushed" into artificial situations to learn tool usage or language.
If we look at when apes first appeared (10M years ago), say the average ape or hominid since then lived 30-40 years, and estimate the average generation length for apes at 20 years (which is roughly accurate according to the latest research), then since the first recorded apes there have been about 500'000 generations of apes and humans (12'000 generations for humans alone).
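Spelling that out (the back-calculation on the human-only figure is my own sanity check, not from any source):

    10'000'000 years / 20 years per generation = 500'000 generations
    12'000 generations * 20 years = 240'000 years, which is in the right
    ballpark for the accepted age of Homo sapiens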
So now compare how we train our models: GPT-3 has 175B parameters and went through billions of iterations of training; for GPT-4 we don't know. And again, it's extremely focused and specific training, feeding the entire human-generated corpus of language, mathematics, logic, etc. into it, and we get something that does pretty well at human language.
Humans have a "base model", as you put it, which really hasn't been "trained" for many generations and has mostly been exposed at random to external stimuli in an ad-hoc, unfocused way, and no single individual has ever been exposed to even a fraction of a fraction of a percent of the stimuli a GPT model sees. So there is something different going on between our brains and artificial neural networks, and I think they can't really be compared at all: the mechanisms, the numbers, and, crucially, the results do not match up in the slightest.
I was being slightly facetious in parts. The point I was driving at is that a human mind has had a very long time to do something perhaps akin to hyperparameter tuning, and we know that even imperceptibly minor changes in brain architecture can be the difference between struggling to put your socks on and being a genius that’ll be remembered through the ages. So those 500M years since the first neuron can’t just be written off.
Ultimately I agree with your final conclusion. You can’t really compare a LLM and evolved human directly. Even just a neuron in an ANN is nothing remotely comparable to a biological neuron. Of course it isn’t surprising that humans and LLMs are different given that they are built to do completely different things on fundamentally different hardware.
It just seems like many people are keen to write off the significance of GPT just because it’s not yet quite as good at everything as the world’s most marvellous example of engineering, the one we all have in our skulls. We didn’t even have transistors 75 years ago, but now we have a pretty believable facsimile (until you really interrogate it) of human intelligence that’s improving a million times faster than evolution was ever capable of. But now the criticism is that it learns in a fundamentally different way to humans and doesn’t generalise fast enough. It’s true, but... really?
I wouldn't go as far as saying it's a "toy" for them; I know some engineers there, and one recent hire spent significant time learning Elixir before joining.
I think my preference is Northrop Frye’s analysis in “Anatomy of Criticism”; his categories of “mythic”, “romantic”, “high mimetic”, “low mimetic”, and “ironic” are particularly useful for analyzing the history of literature, from mythic legends and epic poetry up to modern literature and fantasy.
Although this analysis isn’t so much about general plot structure as about characters, particularly the main protagonist and their relationship to other characters and to the environment of the novel.
I tried this and it seemed to break ChatGPT: it blurted out something which made no sense and then offered to regenerate it. How is it supposed to work?
Maybe it only works with GPT-4? Is there a sample transcript anywhere of what it can do? I still don't quite understand what the end goal is supposed to be. It seems to simulate a mixture-of-experts technique, but I thought GPT-4 already did that.
Also, in the original stories, Robin was actually more of a "rob from the Church and give to the gentry" figure. He was very friendly with the "right sort" of noblemen; he just really didn't like rich priests from the Church. So we (in our modern age) have rather perverted the idea of Robin Hood to fit our times, but he never really was some kind of hero for the poor.
One interesting point here is that humans can learn to do things without language at all. If you raised a human baby and never exposed it to any language, it would still learn skills and behaviors. So while intelligence and reasoning in humans do still seem to be linked to language, it’s not quite as simple as all knowledge and reasoning being encoded in language.
Whereas (obviously) ChatGPT is completely based on language and can’t do anything without language or anything that isn’t derived directly from language.
In college I tried a medication called Topamax (Topiramate) for migraine prevention. Topamax has a low-occurrence side effect of “language impairment” which I was particularly sensitive to.
It was a terrifying experience, but it was also a valuable one as it changed the way I view intelligence.
Ten days in, my writing and speech skills had devolved to those of a primary-school-aged child. I tried to type a text message and struggled to come up with simple words like “and”. My speech slowed down considerably.
The terrifying thing was that my internal world was still as complex and meaningful as it was before. All of the emotions I felt were real and legitimate. My cognition outside of language was intact: I could do math just fine and conceptualize and abstract problems.
In spite of this, I was unable to convey what I was feeling and thinking to the outside world. It felt like I was trapped inside my own body.
After recovering, my intuitive understanding of the link between language and intelligence was changed forever.
I do not experience an internal monologue (unless I make a conscious effort at it), and I also struggle to find the right words for things. For this, and for a few other reasons (like taking a longer time than usual to understand speech), I suspect I may have a mild language processing disorder.
But my inner world is still extremely rich. I have very complex thoughts, can solve complex puzzles, etc., all without thinking any words.
It sounds like your medication gave you a particularly intense experience of this.
Many people have historically downplayed the consciousness of animals and babies on account of their inability to understand language, but it seems language likely has nothing to do with consciousness.
So apparently I'm in an HN minority (judging by the current comments), because I absolutely love this single; in fact, it almost immediately made me cry, hearing John's voice and those lyrics. The song made me think about the passing of time, my own aging, and the people I've lost along the way.
I think it's a beautiful song, and watching the "Making Of" video made it more impactful for me, making me think of those four close friends, with only two of them left now... "Now and Then"
This song has a more…”distant” quality to the sound. It doesn’t seem (to me) like the other songs from before. The lyrics seem a bit too short in comparison too. But it’s haunting enough for me to listen to it several more times.
I was recently listening to a few of those melancholic or nostalgic songs because they were different yet similar to their original styles — The Fool on the Hill, Nowhere Man, Eleanor Rigby, Norwegian Wood, Girl, Michelle, Strawberry Fields Forever, In My Life, and Yesterday.
Edit: I hope deepfakes and AI don’t create more Beatles (or Beatles-like) songs. It’s enough to have what we have and be content with that.