Do they explain anywhere why we consider radicals special and ask these questions about them in the first place? Nobody ever explains that. To my programmer brain, whether I'm solving t^3 = 2 or t^100 + t + 1 = 7 numerically, I have to use an iterative method like Newton's either way. ("But there's a button for Nth root" isn't an argument here - they could've added a more generic button for Newton.) They don't fundamentally seem any different. Why do I care whether they're solvable in terms of radicals? It almost feels as arbitrary as asking whether something is solvable without using the digit 2. I would've thought I'd be more interested in whether they're solvable in some other respect (via iteration, via Newton, with quadratic convergence, or whatever)?
> To my programmer brain, whether I'm solving t^3 = 2 or t^100 + t + 1 = 7 numerically
Typically when you're solving a polynomial equation in an applied context (programming, engineering, physics, whatever), it's because you have modeled a situation as a polynomial, and the information you really want is squirreled away as the roots of that polynomial. You don't actually care about the exact answer. You've only got k bits of precision anyway, so Newton's method is fine.
But we're not interested in the solution. We don't particularly care that t = approx. 1.016 is a numerical solution.[0] We're not using polynomials to model a situation. In mathematics, we are often studying polynomials as objects in and of themselves, in which case the kind of roots we get tells us something about how polynomials work, or we are using polynomials as a lens through which to study something else. In either case it's less about a specific solution, and more about what kind of solution it is and how we got it.
Not to mention, specific polynomials are examples. Instead of t^100 + t + 1 = 7, we're usually looking at something more abstract like at^100 + bt + c.
[0] And in the rare case we actually care about a specific root of a specific polynomial, an approximate numerical solution is often not good enough.
Radicals are a "natural" extension in a certain sense, just as subtraction and division are. They invert an algebraic operation we often encounter when trying to solve equations handed to us by e.g. physics. I find it understandable to want to give them a name.
Why not something like "those things we can solve with Newton"? As you note, Newton is broadly applicable; one would hope, given how common the need to invert an exponent is, that something better than Newton (faster, more stable, if more specialized) might be created. It is hard to study a desired hypothetical operation without giving it a name.
On a related note, how come we don't all already have names for the 4th-order iterative operation (iterated exponentiation, i.e. tetration) and its inverse in our heads? Don't they deserve consideration? Perhaps, but nature doesn't seem to hand us instances of those operations very often. We seemingly don't need them to build a bridge or solve some other common practical engineering problem. I imagine that is why they fail to appear in high school algebra courses.
Lots of good answers here already, but I can also add my own perspective. Fundamentally, you're right that there isn't really anything "special" about radicals. The reason I personally find the unsolvability of the quintic by radicals interesting is that you can solve quadratics, cubics, and quartics (that is, polynomial equations of degree 2, 3, and 4) by radicals.
To say the same thing another way: the quadratic formula that you learned in high school has been known in some form for millennia, and in particular you can reduce the question of solving quadratics to the question of finding square roots. So (provided you find solving polynomial equations to be an interesting question) it's fairly natural to ask whether there's an analogous formula for cubics. And it turns out there is! You need both cube roots and square roots, and the formula is longer and uglier, but that's probably not surprising.
Whether you think n'th roots are "intrinsically" more interesting than general polynomial equations or not, this is still a pretty striking pattern, and one might naturally be curious about whether it continues for higher degrees. And I don't know anyone whose first guess would have been "yes, but only one more time, and then for degree 5 and higher it's suddenly impossible"!
They aren't really special, except that adjoining the solutions of a radical to a field makes the associated group of automorphisms simplify in a nice way. Also, at the time, tables of radical roots were common technology and Newton's method was not. But fundamentally, we can define and use new functions to solve things pretty easily; if you get down to it, cos() and sin() are an example of this happening. All those applied-maths weird functions (Bessel, Gamma, etc.) are also this. As well, the reason not to just use numeric solutions for everything is that there are structural or dynamic properties of the solutions of equations that you won't be able to understand from a purely numeric perspective.
I think taking a specific unsolvable numeric equation and deriving useful qualitative characteristics is a useful thing to try. You have cool simple results like the Lyapunov stability criterion or the signs of the eigenvalues around a singularity, and can numerically determine that a system of equations will have such-and-such long-term behavior (or the tests can be inconclusive because the numerical values are right on the threshold between different behaviors).
That's one of the really fun things about taking Galois theory class - you get general results for "all quintics" or "all quadratics" but also you can take specific polynomials and (sometimes) get concrete results (solvable by radicals, but also complex vs. real roots, etc.).
I'd say one of the fundamental lessons of field and Galois theory is that they're not intrinsically special. They're just easier and more appealing to write down. (For the most part, anyway. They're a little bit special in some subspecialties for some technical reasons that are hard to explain.)
One reason to focus on them: when you tell students that x^5 - x - 1 = 0 can't be solved by radicals, that "even if God told you the answer, you would have no way to write it down", this is easy to understand and a powerful motivator for the theory. It's a nice application which is not fundamental, but which definitely shows that the theory has legs.
If you want to know which polynomials are solvable by Newton's method? All of them. It illustrates that Newton's method is extremely useful, but the answer itself is not exactly interesting.
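To make the contrast concrete, here is a minimal sketch (in Python; the starting guesses and tolerances are my own choices) of one generic Newton iteration handling both equations from the original question:

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=100):
    """Generic Newton iteration: x <- x - f(x)/f'(x)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# t^3 = 2  becomes  f(t) = t^3 - 2 = 0
r1 = newton(lambda t: t**3 - 2, lambda t: 3 * t**2, 1.0)

# t^100 + t + 1 = 7  becomes  f(t) = t^100 + t - 6 = 0
r2 = newton(lambda t: t**100 + t - 6, lambda t: 100 * t**99 + 1, 1.1)
```

The same dozen lines solve both; nothing in the method distinguishes the "radical" case from the degree-100 one, which is exactly the point.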
> "even if God told you the answer, you would have no way to write it down"
But isn't this also true for generic quadratics/cubics too? Like the solution to x^3-2=0 is cubed_root_of(2), so it seems we can "write it down". But what is the definition of cubed_root_of(2)? Well, it's the positive solution to x^3-2=0...
When I say that a fundamental lesson of field theory is that radicals are not really special, this is what I mean. You are thinking in a more sophisticated way than most newcomers to the subject.
I feel there is an interesting follow-up problem here. The polynomials x^n - a = 0 are used to define the "radicals", a family of functions F_n such that F_n(a) = the real nth root of a = the real solution of x^n - a = 0. Using these radicals you can solve all quadratics, cubics and quartics.
Now take another collection of unsolvable polynomials; your example was x^5 - x - 1 = 0 and maybe parameterize that in some way such that these polynomials are unsolvable. This gives us another family of functions G_n. What if we allow the G_n's to be used in our solutions? Can we solve all quintics this way (for example)?
I'm not sure I understand your question exactly, but I am fairly certain that not all quintics can be solved by the combination of (1) radicals, i.e. taking roots of x^n - a, and (2) taking roots of x^5 - x - 1. I don't have a proof in mind at the moment, but I speculate it's not too terribly difficult to prove.
If I'm correct, then the proof would almost certainly use Galois theory!
He is not wholly wrong: while the quintic is not solvable in the general case by ordinary radicals, there is a family of functions, a special "radical" as he said, the Bring radical, that solves the quintic in general. Of course, as said, it's not a 5th root, but the solution to a certain family of quintics.
You are right in the sense that solvability by radicals has no practical importance, especially when it comes to calculations.
It is just a very classical pure-math question, dating back hundreds of years. Its solution led to the development of group theory and Galois theory.
Group theory and Galois theory then are foundational in all kinds of areas.
Anyway, so why care about solvability by radicals? To me the only real reason is that it's an interesting and a natural question in mathematics. Is there a general formula to solve polynomials, like the quadratic formula? The answer is no - why? When can we solve a polynomial in radicals and how?
And so on. If you like pure math, you might find solvability by radicals interesting. It's also a good starting point and motivation for learning Galois theory.
> Do they explain anywhere why we consider radicals special and ask these questions about them in the first place? Nobody ever explains that. To my programmer brain,
But regardless of whether you're a programmer, mathematician, machinist, carpenter, or just a kid playing with legos, there's always a good time to be had in the following way: first you look at the most complex problems that you can manage to solve with simple tools; then you ask if your simple tools are indeed the simplest; and then if multiple roughly equivalently simple things are looking tied in this game you've invented then now you get the joy of endlessly arguing about what is most "natural" or "beautiful" or what "simple" even means really. Even when this game seems pretty dumb and arbitrary, you're probably learning a lot, because even when you can't yet define what you mean by "simple" or "natural" or "pretty" it's often still a useful search heuristic.
What can you do with a lot of time and a compass and a ruler? Yes but do we need the "rule" part or only the straight-edge? What can we make with only SKI combinators? Yes but how awkward, I rather prefer BCKW. Who's up for a round of code-golf with weird tiny esolangs? Can we make a zero instruction-set computer? What's the smallest number of tools required to put an engine together? Yes but is it theoretically possible that we might need a different number of different tools to take one apart? Sure but does that really count as a separate tool? And so it goes.. aren't we having fun yet??
The prime importance of solving by radicals is actually that it led to the theory of groups! Groups are used in all sorts of places. (One nice pictorial example is the fundamental group of a topological space.) Similarly, complex numbers arose in trying to solve the cubic. Also, the statement of Fermat's Last Theorem doesn't have any applications, but its solution led to a lot of interesting theory, like how ideals get factorized in rings, elliptic curves, Galois representations...
BTW, the same theory can be extended to differential equations and Differential Galois Theory tells you if you can get a solution by composing basic functions along with exponentials.
Historically, radicals can be motivated by looking at people trying to solve linear equations, then quadratics, the Renaissance duels over cubics and quartics, the futile search for solving quintics, etc. Incidentally, quintics, and indeed equations of any degree, can have a closed-form solution using modular functions.
> To my programmer brain, whether I'm solving t^3 = 2 or t^100 + t + 1 = 7 numerically, I have to use an iterative method like Newton's either way.
Doesn’t your programmer brain want things to run as fast as possible?
If you have another weapon in your arsenal for solving polynomial equations, you have an extra option for improving performance. As a trivial example, you don’t call your Newton solver to solve a linear equation, as the function call overhead would mean giving up lots of performance.
Also, if you solve an equation not because you want to know the roots of the equation but because you want to know whether they’re different, the numerical approach may be much harder than the analytical one.
> I would've thought I'd be more interested in whether they're solvable in some other respect (via iteration, via Newton, with quadratic convergence)
That’s fine. For that, you read up on the theory behind various iterative methods.
In a real problem you can have equations with parameters, where the coefficients depend on some other parameters. And you should be able to answer questions about the roots depending on the parameter values, like 'in which range of the parameter p do you have real (non-complex) roots?'. Having a formula for the roots in terms of the coefficients lets you answer such questions easily. For example, for the quadratic equation ax^2 + bx + c = 0 we have D = b^2 - 4ac, and if D < 0 then there are no real (non-complex) roots.
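As a minimal sketch of that idea (the family x^2 + p*x + 1 = 0 is my own example), the discriminant answers the parameter question without ever computing a root numerically:

```python
import math

def real_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 (a != 0), or [] if there are none."""
    D = b * b - 4 * a * c          # the discriminant decides everything
    if D < 0:
        return []
    s = math.sqrt(D)
    return [(-b - s) / (2 * a), (-b + s) / (2 * a)]

# For the family x^2 + p*x + 1 = 0 the discriminant is p^2 - 4,
# so real roots exist exactly in the parameter range |p| >= 2.
assert real_roots(1, 1, 1) == []       # p = 1: no real roots
assert len(real_roots(1, 3, 1)) == 2   # p = 3: two real roots
```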
I imagine it has something to do with the fact that there is a somewhat simple pen-and-paper procedure, quite similar to long division, for calculating square/cubic/etc. roots to arbitrary precision, digit by digit. So finding roots is, in a sense, an arithmetic operation; and of calculus one had better not speak in polite society.
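That pen-and-paper procedure can be sketched in code; this is the classic long-division-style square-root algorithm (the function name and interface are my own):

```python
def sqrt_digits(n, digits):
    """Digit-by-digit square root, like the pen-and-paper method:
    returns floor(sqrt(n) * 10**digits) for a non-negative integer n."""
    n *= 10 ** (2 * digits)            # shift so the answer gains `digits` decimals
    s = str(n)
    if len(s) % 2:                     # pad to whole pairs of digits
        s = "0" + s
    x, remainder = 0, 0
    for i in range(0, len(s), 2):      # bring down two digits at a time
        remainder = remainder * 100 + int(s[i:i + 2])
        # largest digit d with (20*x + d) * d <= remainder
        d = 0
        while (20 * x + d + 1) * (d + 1) <= remainder:
            d += 1
        remainder -= (20 * x + d) * d
        x = x * 10 + d
    return x

# sqrt(2) = 1.4142135..., so:
assert sqrt_digits(2, 6) == 1414213
```

Each loop iteration produces exactly one more digit, which is what makes it feel like an arithmetic operation rather than "iterate until converged".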
Superb question! Interestingly, my math professor asked the exact same question after teaching Galois theory and stated that he does not have a good answer himself. Let me try to give sort of an answer. :)
We have fingers. These we can count. This is why we are interested in counting. This gives us the natural numbers and why we are interested in them. What can we do with natural numbers? Well, the basic axioms allow only one thing: incrementing them.
Now, it is a natural question to ask what happens when we increment repeatedly. This leads to addition of natural numbers. The next question is to ask is whether we can undo addition. This leads to subtraction. Next, we ask whether all natural numbers can be subtracted. The answer is no. Can we extend the natural numbers such that this is possible? Yes, and in come the integers.
Now that we have addition, we can ask whether we can repeat it. This leads to multiplication by a natural number. Next, we ask whether we can undo it, and we get division and the rational numbers. We can also ask whether multiplication makes sense when both operands are non-natural.
Now that we have multiplication, we can ask whether we can repeat it. This gives us raising to the power of a natural number. Can we undo this? This gives radicals. Can we take the root of any rational number? No, and in come field extensions of the rationals, including the complex numbers.
A different train of thought asks what we can do with mixing multiplication and addition. An infinite number of these operations seems strange, so let's just ask what happens when we have a finite number. It turns out, no matter how you combine multiplication and addition, you can always rearrange them to get a polynomial. Formulated differently: every branch-free and loop-free finite program is a polynomial (disregarding numeric stability). This view as a program is what motivates the study of polynomials.
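As a small illustration of that claim (the particular straight-line program is my own), any sequence of additions and multiplications collapses into a single polynomial:

```python
def straight_line(x):
    # A branch-free, loop-free program using only + and *.
    a = x * x + 3      # a = x^2 + 3
    b = a * x + a      # b = (x^2 + 3)(x + 1) = x^3 + x^2 + 3x + 3
    return b * b

def as_polynomial(x):
    # The same computation, rearranged into one explicit polynomial.
    return (x**3 + x**2 + 3 * x + 3) ** 2

# The two agree everywhere (checked here at a few integer points).
for x in (-2, -1, 0, 1, 2, 5):
    assert straight_line(x) == as_polynomial(x)
```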
Now that we have polynomials, we can ask whether we can undo them. This motivates looking at roots of polynomials.
Now, we have radicals and roots of polynomials. Both motivated independently. It is natural to ask whether both trains of thought lead to the same mathematical object. Galois theory answers this and says no.
This is a somewhat surprising result, because up to now, no matter in which order we asked the questions: Can we repeat? Can we undo? How to enable undo by extension? We always ended up with the same mathematical object. Here this is not the case. This is why the result of Galois theory is so surprising to some.
Slightly off-topic but equally interesting is the question of what happens when we allow loops in our programs with multiplication and addition, i.e. when we mix an infinite number of additions and multiplications. This is somewhat harder to formalize, but a natural way to look at it is to say that we have some variable, in the programming sense, that we track in each loop iteration. The values that this variable takes form a sequence. Now the question is what this variable will end up being when we iterate very often. This leads to the concept of the limit of a sequence.
Sidenote: You can look at the usual mathematical limit notation as a program. The limit sign is the while-condition of the loop and the part that describes the sequence is the body of the loop.
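A concrete instance of such a loop (the choice of iteration is mine): using only multiplication and addition/subtraction, the update x <- x * (2 - a*x) produces a sequence whose limit is 1/a, a value the loop approaches but never reaches in any finite number of steps:

```python
def reciprocal(a, x=0.1, iterations=60):
    """Approximate 1/a with a loop that uses only * and +/-:
    x <- x * (2 - a * x) converges quadratically to 1/a
    for any starting guess with 0 < x < 2/a."""
    for _ in range(iterations):
        x = x * (2 - a * x)
    return x

# The multiply-add loop's limit is the reciprocal, e.g. 1/3:
assert abs(reciprocal(3.0) - 1 / 3) < 1e-12
```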
Now that we have limits and rational numbers, we can ask how to extend the rational numbers such that every Cauchy sequence of rationals has a limit. This gives us the real numbers.
Now we can ask the question of undoing the limit operation. Here the question is what undoing actually means. One way to look at it is whether you can find, for every limit, i.e., every real number, a multiply-add-loop program that describes the sequence whose limit was taken. The answer turns out to be no. There is a countably infinite number of programs but uncountably many real numbers. There are way more real numbers than programs. In my opinion this is a far stranger result than that of Galois theory. It turns out that nearly no real number can be described by a program, or more generally by any textual description. For this reason, in my opinion, the real numbers are the strangest construct in all of mathematics.
I hope you found my rambling interesting. I just love to talk about this sort of stuff. :)
Thanks so much, I feel like you're the only one who grasps the crux of my question!
> This is a somewhat surprising result, because up to now, no matter in which order we asked the questions: Can we repeat? Can we undo? How to enable undo by extension? We always ended up with the same mathematical object.
I think this is the bit I'm confused on - we have an operation that is a mixture of two operations, where previously we only looked at pure compositions of operations (let's call this "impure"). Why is it surprising that the inversion of an "impure" operation produces an "impure" value?
It's like saying, if I add x to itself a bunch, I always get a multiple of x. If I do the same thing with y, then I get a multiple of y. But if I add x and y to each other, I might get a prime number! Is that surprising? Mixing is a fundamentally new kind of operation; why would you expect its inversion to be familiar?
What is surprising and what not is always very subjective.
Now that I think more about it, one could argue that everything you can do with inverting radicals can also be done by inverting polynomials. So you could look at radicals as the step after multiplication, and at inverting polynomials as the step after radicals. With this, my depiction of these as two competing extensions falls apart a bit.
My chain of argumentation was that one could expect a single natural, ever-growing set of "numbers": starting with the natural numbers, then the integers, then the rationals, then the reals, culminating in the complex numbers, with every set a superset of the previous one. This is the "natural" order in which they are taught in school and somewhat mirrors how they were historically discovered. In retrospect, this is obviously not true; just look at the existence of the rational complex numbers. However, when all you have are the natural, integer and rational numbers, it seems like it could be true.
Let me try a different way of explaining why it is surprising to some.
In school, I learned that I can solve quadratic equations by combining the inverse operations of the basic operations that make up the quadratic polynomial. This seems natural, as it worked for solving the linear equations I had seen so far: the inverse of a combination is the combination of the inverses. At some point the teacher showed the formula for degree three. Cubic radicals appeared. We were overwhelmed by its size, but the basic operations used matched what we expected. The teacher said that the degree-4 formula is drastically larger still, with degree-4 radicals, and that we definitely did not want to see it, which is true. Nothing was said about degree 5, but it felt like it was implied that the pattern continues and that the main problem with degree 5 is that our brains are just not able to handle the number of operations that make up the formula.
Fast forward to university. Now the professor proves in the Galois theory course that, no, it's not that you are too stupid to handle degree 5. It's just that degree 5 cannot be handled this way at all. I am still unsure about whether my teacher in school knew that degree 5 is impossible or just assumed that he too is just too stupid.
I guess mathematicians must have felt something similar back then. You learn about linear equations; all is easy and works. You learn about quadratics; after mixing in quadratic radicals, all is well again. You try to grasp cubics, and yes, with a lot of work this too can be learned. You think about quartics and, after lots and lots of time, come to the conclusion that yes, it is possible, but the formula is too large to master. It feels like the pattern should continue, and that the reason you don't have a quintic formula with degree-5 radicals is not that it does not exist, but its sheer size: just stating it would fill a whole book. Turns out, there is no such book.
Suppose you are a renowned mathematician back then who has failed for years to find a quintic formula. Now this teenager named Évariste comes along and fails too but says that it's not because he's too stupid but because it's impossible. At first, this does sound like an excuse of a lazy student, doesn't it?
Let's say you are not surprised that roots of degree-5 polynomials cannot be computed using just addition, subtraction, multiplication, division, and radicals. Does it surprise you that degree-4 polynomials can? Why does this work for degrees 2, 3 and 4, yet fail for 5 and higher? I can see that one can argue that there is no reason to assume it always works. However, at least learning the fact that it starts failing at degree 5 should be non-intuitive.
You could think of it as a proof saying which polynomials are solvable by which algorithms. Solvable by radicals is one class of simpler algorithm and it so happens we have a cute proof as to when it will work or fail.
I think these are two different questions:
- Why care about radicals?
- Why try to solve polynomial equations in terms of radicals?
For the first question:
Taking Nth powers is a fairly basic operation, which occurs all the time in mathematics.
Taking Nth roots is simply the inverse operation, so it is fairly natural to be interested in it/having to deal with it.
For the second question:
Let’s pretend for a moment that we didn’t know what the quadratic formula looked like.
Could we nevertheless say anything about it?
The quadratic formula is supposed to give us the solutions to the equation a x^2 + b x + c = 0.
A special case of this general quadratic equation is x^2 - p = 0.
There are two ways of solving this specialized equation:
either by taking a square root, giving us the two solutions ±√p, or by using the general quadratic formula (with a = 1, b = 0, c = -p).
Both of these approaches need to give us the same results, since they are both correct.
This tells us that if we simplify the quadratic formula with a = 1, b = 0, c = -p, then a square root needs to appear.
How can this happen?
Well, the most basic guess is that the quadratic formula contained at least one square root to begin with.
Looking at the actual quadratic formula tells us that this guess is correct:
the formula uses the four basic arithmetic operations (addition, subtraction, multiplication, division) and a square root.
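The observation is easy to check numerically; a minimal sketch (p = 7 is an arbitrary choice of mine) specializing the quadratic formula to a = 1, b = 0, c = -p:

```python
import math

def quadratic_roots(a, b, c):
    """The quadratic formula, assuming two real roots (b^2 - 4ac >= 0)."""
    s = math.sqrt(b * b - 4 * a * c)
    return (-b + s) / (2 * a), (-b - s) / (2 * a)

# Specializing to x^2 - p = 0, i.e. a = 1, b = 0, c = -p:
p = 7.0
r_plus, r_minus = quadratic_roots(1.0, 0.0, -p)
assert math.isclose(r_plus, math.sqrt(p))    # the formula collapses to +sqrt(p)
assert math.isclose(r_minus, -math.sqrt(p))  # ...and to -sqrt(p)
```

The square root in the general formula is exactly what survives the simplification, as the argument predicts.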
We can repeat the same thought experiment for cubic equations, and we find that the cubic formula should probably contain third roots.
Looking up the formula confirms this suspicion.
However, it should be noted that the cubic formula contains not only third roots, but also square roots.
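For the depressed cubic t^3 + p*t + q = 0, Cardano's formula makes both kinds of roots visible at once: a square root nested inside two cube roots. A minimal sketch of the real-root case (the example cubic is my own):

```python
import math

def real_cbrt(x):
    # Real cube root that also handles negative arguments.
    return math.copysign(abs(x) ** (1 / 3), x)

def cardano_real_root(p, q):
    """Cardano's formula for t^3 + p*t + q = 0 when q^2/4 + p^3/27 >= 0:
    note the square root sitting inside the two cube roots."""
    s = math.sqrt(q * q / 4 + p ** 3 / 27)
    return real_cbrt(-q / 2 + s) + real_cbrt(-q / 2 - s)

# t^3 + 6t - 20 = 0 has the real root t = 2.
assert abs(cardano_real_root(6.0, -20.0) - 2.0) < 1e-9
```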
The situation for the quartic equation is similar:
we suspect that the quartic formula contains fourth roots.
And thanks to our experience with the cubic formula, we may also suspect that the quartic formula contains third roots and square roots.
Looking up the formula, we see that it contains both third roots and square roots, but not (directly) any fourth roots.
(Our original idea breaks down a bit because fourth roots can be expressed as iterated square roots.
This makes it possible that the general quartic formula does not contain fourth roots, even though its simplified version will contain them.)
So what about a general polynomial equation of degree N >= 5?
Our original observation tells us that a solution formula needs to contain some sort of operation(s) that, when the formula is applied to certain special cases, gives us Nth roots.
Just as before, the most basic guess is that the formula will contain Kth roots, and the previous examples suggest that one should expect K = 2, ..., N to occur.
Summary:
To find a formula for polynomial equations of degree N >= 2, we are forced to use additional operations apart from the four basic arithmetic operations.
In certain special cases, these additional operations need to simplify to roots.
This suggests using roots in the formula, and the cases N = 2, 3, 4 support this idea.
Heuristically speaking, we are not trying to use roots because we want to, but because they seem to be the bare minimum required to even hope of finding a formula.