
I was talking to my (12 year old) son about parts of math he finds boring. He said that he thinks absolute value is absurdly easy and extremely boring. I asked him if there was anything that might make it more interesting, he said "maybe complex numbers".

So I asked him "what would the absolute value of i+1 be?" he thinks for a little bit and says "square root of 2" and I ask him "what about the absolute value of 2i + 2?" "square root of 8"

I ask him "why?" and he said "absolute value is distance; in the complex plane the absolute value is the hypotenuse of the imaginary and real numbers."
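For what it's worth, his answers check out numerically; a quick sketch in Python, whose built-in complex type defines abs() exactly this way:

```python
import math

# The absolute value (modulus) of a complex number is its distance
# from the origin in the complex plane: the hypotenuse of a right
# triangle whose legs are the real and imaginary parts.
print(abs(1 + 1j))    # |i+1|: hypotenuse of a 1x1 triangle
print(math.sqrt(2))
print(abs(2 + 2j))    # |2i+2|: hypotenuse of a 2x2 triangle
print(math.sqrt(8))
```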

So -- first of all, it was a little surprising to me that he'd thought about this sort of thing, having mostly just watched YouTube videos about math; and second, this sort of understanding is a result of some manner of understanding the underlying mechanisms and not a result of just having a huge dictionary of synonyms.

To what degree can these large language models arrive at these same conclusions, and by what process?



> this sort of understanding is a result of some manner of understanding the underlying mechanisms and not a result of just having a huge dictionary of synonyms.

He developed an understanding of the underlying mechanisms because he correlated concepts between algebraic and geometric domains, i.e. multimodal training data. Multimodal models are already known to be meaningfully better than unimodal ones. We've barely scratched the surface of multimodal training.


The first YouTube video hit for "absolute value of complex numbers" says within 30 seconds that you take the two numbers, square them, add them, and take the square root of the result. I doubt he had to come up with that on his own.


The child clearly demonstrated a geometric, rather than formulaic, understanding of the problem


I imagine that was shown visually in the YouTube video? That it's a hypotenuse, like he explained, and this is how to calculate it. I'm just not seeing evidence that he came up with the idea on his own.

He basically reiterated the definition, and had to know the formula.

If the child could explain why we should even use or have complex numbers, that would be impressive. Otherwise it seems like nothing more than a hypotenuse calculation dressed up in different, "complex" or "impressive" sounding terms.

Why should you be interested in this in the first place?


Your son is a goddamn genius.


Alternatively, he watches youtube videos about math, and if you’re a young math geek what’s cooler than “here’s a type of number they won’t teach you until the really advanced classes”

Not to dismiss this kid at all, I love that there are channels like 3Blue1Brown to share math to people in a way that really connects with them and builds intuition.

When I was a student you basically just had your math teacher and textbooks to learn from, which meant if you weren’t on the same page as them you’d get left behind. If you went to the library, most math books assume you’re familiar with the language of mathematics, so it can be tough to learn for that alone. I bet a lot of innumeracy is due to that style of teaching, often “I just don’t get math” is “I missed learning this connection and the class just moved on”.


Maybe; he still needs to finish his damned homework and remember to turn it in. And eat some vegetables.


All those things sound very boring to me.

I can offer no concrete solutions.

However, I have a friend who graduated from high school #1 of a big class and 2 years early. His mom explained that if he made at least a 1400 (of 1600) on his SAT, she would buy him a new gaming computer. He then proceeded to make exactly a 1400. No more. No less.

I recommend, if you haven't tried it already, an iteration of this approach using a sliding-scale reward system. Perhaps a gaming PC with an Nvidia 4060 Ti, up to *insert parental budget*, in the event of a perfect SAT score.

Ofc this only works if he's a gamer. I feel this type of system can be applied in many areas though. In my view, the clever component his mother applied is that the computer he earned was not just a desirable reward... It was VERY desirable.

My parents also tried this system with me. It didn't work as well. The reward was not sizable enough. It just didn't seem worth it. Too low value. Also, I already had a job and bought my own. My parents were unwilling to budget a sufficient reward. It's gotta be something he more or less is unlikely to be able to get via other means.

Now my friend is a physician. He graduated top of his class from med school. I think he's pretty content with life.

The bored ones can be a little more trouble sometimes. Fun breed though. Best of luck.


Be careful with reward systems, as it can destroy internal motivation.

Additionally, one very important thing to learn as a young adult is how to motivate yourself to do things that have only long term payoffs.

Of course I also understand that you can take the SAT only once, so as bad as that is, it's maybe not the best time to learn a life lesson.


Is it a recent thing that you can only take it once? when I was a teenager you could take it as many times as you wanted


I was wrong. I was making assumptions


You can, but they send all your scores to colleges, not just your latest.


The problem is that there are rarely guarantees for long-term payoffs.


I scored a 32 on the ACT, which was one of the highest scores in the high school, if not the highest. My parents thought I could do better and that it would be worth it, so they offered a new hunting rifle if I improved my score. Got a 35 on the retake and got a super nice Sako rifle and scope--IIRC a little over $1000 in 2005.


Haha well done.

I like the iterative approach. Perhaps we can amend the test case with advice to keep the receipt for the first video card and offer an upgrade to a 4070 Ti on the retake or whatnot.

Or bigger/better boom stick on the retake or whatnot.


>reward system

Is a good idea if you need a (relatively) constant result because your brain adapts to the idea of getting a satisfying reward.

https://www.youtube.com/watch?v=s5geuTf8nqo (+ some other of his videos about rewards)


But his homework is boring


Just quit school and join YC now :-)


Not saying the kid can't be a genius, but grandparent discussing math with the kid and incentivising him to learn is probably a massive boost to his development. It's not the same as having to go to the library and teach yourself. Still, props to the kid though.


plot twist, his son is 38 and has a physics degree


His name? Albert Einstein.


I'm going to be this guy, but isn't it just the Pythagorean theorem with a slight twist, which is taught at the 11-14 year old level?

It only sounds complicated because of the words used like "complex", "imaginary", "real".

So if you studied Pythagoras at school and someone (a YouTube video) says you just have to do Pythagoras on the i multiplier and the other number, it would be fairly easy if you understand Pythagoras?


I remember some time ago watching an episode of the Joe Rogan show (it had some comedic value back then). He and his friends were talking about the MIT admissions exam, pointing out the square root in a maths problem as an indication that the problem was really hard. And I thought to myself, "that's what primary school children learn around here at age 12, in my literally 3rd-world country".

Pythagoras was taught around the same age. I'd like to warn people that not understanding these basic math concepts makes you appear uneducated to many people internationally.


I put "absolute value of complex numbers" into YouTube, and within 30 seconds the first video says it's the root of a squared plus b squared. So all the kid has to know is to multiply a by itself and b by itself, add them together, and take the root.


I'm in the U.S., and we learned those things around that age too, maybe a little older.


That's interesting. Was that in a public school? Would you be willing to share your state and if you believe your experience represents a national average or is above/below the national average in regards to "at what age do children learn about square root"?


The large language models will read your comment here and remember the answer.


GPT-4 correctly reconstructs the "complex modulus" token sequence already. Just ask it the same questions as the parent. Probably interesting to see what it will do, when it turns twelve.


The spot instance declared "a similar vector exists" and de-provisioned itself?


Hah, I asked ChatGPT, and yeah, it nailed it: https://chat.openai.com/share/f1bfed28-4451-4fac-8bac-1386fe...


What makes you think that an LLM has a "huge dictionary of synonyms"? That's not how LLMs work. They capture underlying concepts and their relations. You had a good point going until you make a straw man argument about the capabilities of LLMs.


Any source on what they actually capture? Seems interesting to me.


If you ask Ilya Sutskever, he will say your kid's head is full of neurons, and so is an LLM's.

LLMs consume training data and can then be asked questions. How different is that from your son watching YouTube and then answering questions?

It's not 1:1 the same, yet, but it's in the neighborhood.


Well, my son is a meat robot who's constantly ingesting information from a variety of sources including but not limited to youtube. His firmware includes a sophisticated realtime operating system that models reality in a way that allows interaction with the world symbolically. I don't think his solving the |i+1| question was founded in linguistic similarity but instead in a physical model / visualization similarity.

So -- to a large degree "bucket of neurons == bucket of neurons" but the training data is different and the processing model isn't necessarily identical.

I'm not necessarily disagreeing as much as perhaps questioning the size of the neighborhood...


Heh, I guess it's a matter of perspective. Your son's head is not made of silicon, so in that sense it is a large neighborhood. But if you put them behind a screen and only see the output, then the neighborhood looks smaller. Maybe it looks even smaller a couple of years in the future. It certainly looks smaller than it did a couple of years in the past.


From the meat robot perspective the structure, operation and organisation of the neurons is also significantly different.


Maybe Altman should just go have some kids and RLHF them instead.


Doesn't scale.

Too many years to max compute. All models limited lifespan inherent.

Avg $200k+ training cost over 18 years of in-house data center costs. More for reinforcement.

He's still 38. Gates took much longer. To stop working 24/7.


There are thousands of structures and substances in a human head besides neurons, at all sorts of commingling and overlapping scales, and the neurons in those heads behave much differently and with tremendously more complexity than the metaphorical ones in a neural network.

And in a human, all those structures and substances, along with the tens of thousands more throughout the rest of the body, are collectively readied with millions of years of "pretraining" before processing a continuous, constant, unceasing multimodal training experience for years.

LLMs and related systems are awesome and an amazing innovation that's going to impact a lot of our experiences over the next decades. But they're not even in the same galaxy as almost any living system yet. That they look like they're in the neighborhood is because you're looking at them through a very narrow, very zoomed telescope.


Even if they are very different (less complex at the neuron level?) to us, do you still think they’ll never be able to achieve similar results (‘truly’ understanding and developing pure mathematics, for example)? I agree that LLMs are less impressive than it may initially seem (although still very impressive), but it seems perfectly possible to me that such systems could in principle do our job even if they never think quite like we do.


True. But a human neuron is more complex than an AI neuron by a constant factor. And we can improve constants. Also you say years like it's a lot of data--but they can run RL on chatgpt outputs if they want, isn't it comparable? But anyway i share your admiration for the biological thinking machines ;)


The sun is also better than a fusion reactor on earth by only a constant factor. That alone doesn't mean much for our prospects of matching its power output.


> human neuron is more complex than an AI neuron by a constant factor

The constant can still be out of reach for now: say, 100T neurons in the brain vs 100B in ChatGPT. Also, the brain may involve some quantum mechanics, for example, which would make the complexity difference not constant but, say, exponential.


> and also brain can involve some quantum mechanics

A neuroscientist once pointed this out to me when illustrating how many huge gaps there are in our fundamental understanding of how the brain works. The brain isn't just as a series of direct electrical pathways - EMF transmission/interference is part of it. The likelihood of unmodeled quantum effects is pretty much a guarantee.


Wikipedia says 100 billion neurons in the brain


Ok, I messed up: we need to compare LLM weights with synapses, not neurons, and Wikipedia says there are 100-500T synapses in the human brain.


Ok, let's say 500T. Rumor is that GPT-4 is currently 1T. Do you expect GPT-6 to be less than 500T? Non-sarcastic question. I would lean no.


So, if they trained GPT-4 with $10B in funding, then for a 500T model they would need $5T in funding.


To continue on this: LLMs are actually really good at asking questions, even about cutting-edge research. Often, I believe, convincing the listener that it understands more than it does.


... which ties into Sam's point about persuasiveness before true understanding.


My son plays soccer


As someone who was thinking about the absolute value of complex numbers at that age, I wish I had played more soccer.


Mine fell out of bed this morning.


Sorry, can you explain this? To me, it makes sense to define abs(x) = sqrt(x^2), i.e. ignoring the negative solution enforces the positive result. Using that definition, abs(i+1) = sqrt((i+1)^2) = sqrt(i^2 + 2i + 1) = sqrt(-1 + 2i + 1) = sqrt(2i) != sqrt(2). The second example seems off in the same way (i.e. the answer should be sqrt(8i) instead of sqrt(8)). Am I missing something? Also, abs(i+2) = sqrt((i+2)^2) = sqrt(i^2 + 4i + 4) = sqrt(-1 + 4i + 4) = sqrt(4i + 3), which doesn't seem to follow the pattern your son described.

Also, just to point out that my understanding of absolute value is different than your son's. That's not to say one is right and another is wrong, but there are often different ways of seeing the same thing. I would imagine that LLMs would similarly see it a different way. Another example of this is people defining pi by its relation to the circumference of a circle. There's nothing wrong with such a definition, but it's certainly not the only possible definition.
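For what it's worth, you can see the mismatch between the two candidate definitions numerically; a quick sketch in Python, using `cmath` for the complex square root:

```python
import cmath
import math

z = 1 + 1j
candidate = cmath.sqrt(z**2)   # sqrt((i+1)^2) = sqrt(2i): still a complex number
modulus = abs(z)               # the distance |i+1| = sqrt(2): a real number

print(candidate)   # complex result, not the real number sqrt(2)
print(modulus)
print(math.sqrt(2))
```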


> To me, it makes sense to define abs(x) = sqrt(x^2) i.e. ignoring the negative solution enforces the positive result.

Why does this make sense to you? You have some notion of what an absolute value should be, on an intuitive or conceptual level, and the mathematical definition you give is consistent with that (in the one dimensional case).

Now taking this valid definition for the 1-d case and generalizing that to higher dimensions is where you run into problems.

Instead, you can go back to the conceptual idea of the absolute value and generate a definition for higher dimensional cases from there.

Interpreting absolute value as the distance from the origin yields the same concrete definition of abs(x) = sqrt(x^2) for the 1-d case, but generalizes better to higher dimensions: abs( (x,y) ) = sqrt(x^2 + y^2) for the 2-d case equivalent to complex numbers.
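A minimal sketch of that in Python (`vec_abs` is just an illustrative helper name):

```python
import math

def vec_abs(x, y):
    # Distance from the origin in 2-d: sqrt(x^2 + y^2)
    return math.sqrt(x**2 + y**2)

print(vec_abs(-3, 0))   # reduces to the 1-d abs: 3.0
print(vec_abs(1, 1))    # sqrt(2), same as...
print(abs(1 + 1j))      # ...the complex modulus |i+1|
```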


> Why does this make sense to you? You have some notion of what an absolute value should be, on an intuitive or conceptual level, and the mathematical definition you give is consistent with that (in the one dimensional case).

In my mind abs(x) = x*sign(x) which is why the above formulation seems correct. This formulation is useful, for example, in formulating reflections.

> Instead, you can go back to the conceptual idea of the absolute value and generate a definition for higher dimensional cases from there.

This is an interesting idea... how would you define sign(x) in a higher dimension? Wouldn't sign in a higher dimension be a component-wise function? E.g. the reflection would happen on one axis but not the other.

> Interpreting absolute value as the distance from the origin

This seems to make sense in that it is a different interpretation of abs which seems simpler than reflection in higher dimensions, but seems like a different definition.

I know that there are applications of complex numbers in real systems. In such systems, the complex definition seems to not be as valuable. E.g. if I'm solving a Laplace transform, the real-number definition seems more applicable than the complex-number definition, right?

I've asked Wolfram Alpha to solve the equation and it lists both answers: one using the formulation of sqrt(x^2) and the other using sqrt(re(x)^2 + im(x)^2), so it seems like there is merit to both...

I suppose in the Laplace example we are actually operating in one dimension, and the imaginary component is approximating something non-real that doesn't actually exist. I.e. any real/observable effect only happens when the imaginary component disappears, meaning that this is still technically one dimension. So, since we're still in one dimension, the one-dimensional formula still applies. Is that correct?

Your explanation has been the most helpful though, thanks.


> In my mind abs(x) = x * sign(x) which is why the above formulation seems correct.

> This is an interesting idea...how would you define sign(x) in a higher dimension?

You could think of the sign as the direction. In the 1-d case, you only have two directions. Positive sign means to the right of the origin, negative sign means to the left of the origin. But in higher dimensional case, you don't get a higher count of directions, instead direction becomes a space.

To see this analogy we can rewrite your abs(x) = x * sign(x) as x = abs(x) * sign(x). (Because 1/sign(x) = sign(x) except at 0, where the two equations agree anyway.)

Now consider that in higher dimensions, we can write x = ||x||*(x/||x||) for any vector x, where ||x|| denotes the magnitude and the term x/||x|| is the unit vector in direction of x. This term then plays the role of the sign.

A simple reflection can then still be done by multiplying this direction term with -1, which in the 2d case reflects at a line through the origin and perpendicular to the vector.
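A quick sketch of that decomposition in Python (the `decompose` helper is just illustrative):

```python
import math

def decompose(x, y):
    # Split a 2-d vector into magnitude and direction (unit vector),
    # the higher-dimensional analogue of x = abs(x) * sign(x).
    mag = math.sqrt(x**2 + y**2)
    return mag, (x / mag, y / mag)

mag, direction = decompose(3.0, 4.0)
print(mag)        # the "abs" part: 5.0
print(direction)  # the "sign" part: a point on the unit circle
# Multiplying the direction by -1 reflects the vector through the origin.
```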

I can't comment on the Laplace transform, it's been too long since I used that.


> abs(x) = x*sign(x)

True in 1 dimension, but not in higher dimensions, because, as you say:

> how would you define sign(x) in a higher dimension?

abs(x) is generally defined as distance of x from zero.

The fact that sqrt(x^2) or x*sign(x) happen to give the same result in 1 dimension doesn't necessarily imply that they can be applied in higher dimensions as-is to result in abs(x) with the same meaning. Although sqrt(x^2) is close, but the way to generalize it is sqrt(sum(x[i]^2)).
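A minimal sketch of that generalization in Python (`abs_nd` is just an illustrative name):

```python
import math

def abs_nd(xs):
    # Distance from zero in any number of dimensions: sqrt(sum(x[i]^2))
    return math.sqrt(sum(v**2 for v in xs))

print(abs_nd([-5]))       # 1-d: the plain absolute value, 5.0
print(abs_nd([1, 1]))     # 2-d: sqrt(2), matching abs(1 + 1j)
print(abs_nd([1, 2, 2]))  # 3-d: 3.0
```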


The absolute value of a complex number is defined in a different way than that of a real number. For complex number z it is sqrt(Re(z)^2 + Im(z)^2). GP’s examples are correct, I don’t think there’s any ambiguity there.

https://en.m.wikipedia.org/wiki/Absolute_value


That definition of abs has merit. In some spaces we are able first to define only an “inner product” between elements p(a, b) and then follow on by naming the length of an element to be sqrt(p(a, a)).

One trick about that inner product is that it need not be perfectly symmetric. To make it work on complex numbers we realize that we have to define it like p(a,b) = a . conj(b) where the . is normal multiplication and the conjugate operation reflects a complex number over the real line.

Now sqrt(p(i+1, i+1)) is sqrt((i+1) . (-i+1)) = sqrt(-i^2 + i - i + 1) = sqrt(2).

I’m skipping over a lot but I wanted to gesture toward where your intuition matches some well known concepts so that you could dive in more deeply. Also wanted to mention the conjugation trick to make your example work!
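Sketching that in Python with the built-in complex type (nothing here beyond the stdlib):

```python
# Modulus via the inner product p(a, b) = a * conj(b):
# |z| = sqrt(p(z, z)) = sqrt(z * conj(z)), which is always real.
z = 1 + 1j
p = z * z.conjugate()    # (i+1)(-i+1) = 2, purely real
print(p)                 # (2+0j)
print(p.real ** 0.5)     # sqrt(2), same as...
print(abs(z))            # ...the built-in modulus
```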


No, there is just one definition, and it's his son's: https://en.m.wikipedia.org/wiki/Absolute_value#Complex_numbe...


The article you linked literally says that there are two definitions: one for real numbers and another for complex numbers. Thanks for the info.


That’s not what it says. It says that there is a single definition that can be generalized to both real and complex numbers.

A special cases of the general definition where im(z)==0 yields an expression where some parts are multiplied by zero, and can then be omitted entirely.

This means that there is one definition. You can mentally ignore some parts of this when dealing with reals.


There is one definition: the distance to 0. There are several (more than two) different ways to calculate it in different situations.


Have you tested your proposed function against i?

abs(i)

= sqrt(i^2)

= sqrt(-1)

= i

Now, i != 1... so clearly the abs function you have in mind here is doing something that isn't quite aligned with the goal. If we assume that the goal of the absolute value function is to always produce positive real numbers, the function is missing something to deal with imaginary components.

I'm not sure, but based on these cases so far, maybe you just need to "drop the i" in the same way as you need to "drop the negative" in the case of non-imaginary components. Now, "drop the i" is not an actual function so maybe there is something else that you can think of?

EDIT:

Maybe we could do this (works for x = i at least...):

abs(x) = sqrt(sqrt((x^2)^2))

Now.. how about quaternions...
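A quick numeric check, sketched in Python with `cmath` (`patched_abs` is just a hypothetical name for the proposed formula): it behaves for pure real and pure imaginary inputs, but for a mixed input like i+1 it still returns a complex number rather than a distance:

```python
import cmath

def patched_abs(x):
    # The proposed patch: sqrt(sqrt((x^2)^2))
    return cmath.sqrt(cmath.sqrt((x**2)**2))

print(patched_abs(1j))      # effectively 1: works for pure imaginary
print(patched_abs(-3))      # effectively 3: works for pure real
print(patched_abs(1 + 1j))  # still complex, not the distance sqrt(2)
```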


> Also, just to point out that my understanding of absolute value is different than your son's. That's not to say one is right and another is wrong, but there are often different ways of seeing the same thing.

There is definitely a right and wrong answer for this; it's not a matter of opinion. There are two problems with your answer -- one is that it doesn't have a unique answer, the other is that it doesn't produce a real value, both of which are fairly core to the concept of a distance (or magnitude or norm), which the absolute value is an example of.


He's talking about distance in two dimensions, with real numbers on one axis and imaginary on the other.


Humans are set up, I think, to intuitively understand 3d space, as it's what we run around and try to survive in. Language models, on the other hand, are set up to understand language, which humans can also do, but I think with a different part of the brain. There's probably no reason why you couldn't set up a model to understand 3d space - I guess they do that a bit with self-driving. A lot of animals like cats and squirrels are pretty good with 3d space too, but less so with language.


Damn. What YouTube channels does he watch?


> To what degree can these large language models arrive at these same conclusions, and by what process?

By having visual understanding more deeply integrated into the thought process, in my opinion. Then they wouldn't be Large Language Models, of course. There are several concepts I remember and operate on by visualizing them, even visualizing motion. If I want to add numbers, I visualize the carry jumping on top of the next number. If I don't trust one of the additions, I go back, but I can't say if it's because I "mark" the uncertainty somehow.

When I think about my different groups of friends, in the back of my mind a visual representation forms.

Thinking about my flight route forms a mini map somehow, and I can compare distances between places and all.

This helps incredibly in logical tasks like programming and math.

I think it's something that we all learned growing up and by playing with objects around us.


Sounds like your son is ready for you to bring it up another level and ask what the absolute value of a (bounded) function is (assuming they have played with functions e.g. in desmos)


Are you saying an LLM can't come to the right conclusion and give an explanation for "what is the absolute value of 2i + 2"?


Are you saying it could, without having read it somewhere?


Maybe I'm unsure what we're arguing here. Did the guy's kid drum that up himself or did he learn it from YT? Knowledge can be inferred or extracted. If it comes up with a correct answer and shows its work, who cares how the knowledge was obtained?


Yeah, my son only knows about imaginary numbers as far as the Veritasium "epic math duel" video.

As far as I can tell he inferred that |i+1| needs the Pythagorean theorem and that i and 1 are legs of the right triangle. I don't think anyone ever suggested that "absolute value" is "length". I asked him what |2i+2| would be, and his answer of "square root of 8" suggests that he doesn't have it memorized as an answer, because if he did he'd have said "2 square root 2" or something similar.

I also asked if he'd seen a video about this and he said no. I think he just figured it out himself. Which is mildly spooky.


Ah, makes sense. I do think it's impressive to fill in the path of reasoning from unknown to known based on even small nuggets of information, possibly combined with the "learned intuition" your son seems to have developed. Very cool.


If the knowledge was obtained by genuine reasoning, that implies that it could also derive/develop a novel solution to an unsolved problem that is not achieved by random guesses. For example, the conception of a complex number in the first place, to solve a class of problems that, prior, weren't even thought to be problems. There's no evidence that any LLM can do that.


If they're not already in one, you might want to get your kid enlisted in some gifted child programs.


Your 12 year old is the next Einstein.



