Most of the world thinks of mathematics as having only right and wrong answers, with little room for creativity. However, this perception is inconsistent with the way that mathematics is actually done. Like in most other subjects, there is lots of guesswork and speculation, and opportunities for wishful thinking and optimism to shine are manifold. I’d like to give an extended example of how wishful thinking can lead to great mathematics. I do not claim that this is an accurate historical account, although I wouldn’t be too surprised were I to find out that it’s not far from the truth.
We’re accustomed to the following situations and their corresponding pictures. Two lines meet at a point (see Figure 1), a line and a circle or parabola meet at two points (Figures 2 and 3), and two ellipses meet at four points (Figure 4). These are all special cases of a general phenomenon, known as Bézout’s Theorem. Let’s discuss what it says.
We can write an equation of a line in the form ax+by+c=0, for suitable choices of a, b, and c. We can write an equation of a circle in the form x^2+y^2+ax+by+c=0. We can write an equation of a parabola in the form ax^2+bxy+cy^2+dx+ey+f=0 (assuming certain constraints on a, b, c, d, e, and f). An ellipse is similar. Each of these equations is a sum of a bunch of monomials, which are some coefficients multiplied by some powers of x and some powers of y, such as 5x^2y^7. Let’s call the degree of a monomial the sum of the exponent of x and the exponent of y, so it would be 9 in the case of 5x^2y^7. Associated to a curve with a certain polynomial equation, call its degree the maximum of the degrees of the monomials in its equation. Hence, the degree of a line is 1, and the degree of a circle, parabola, or ellipse is 2.
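To make the definition of degree concrete, here is a small Python sketch. Storing a polynomial as a dictionary from exponent pairs to coefficients is my own choice for illustration, and so is the function name degree; nothing here is standard notation.

```python
# A polynomial in x and y, stored as {(power of x, power of y): coefficient}.
# This representation is an assumption made purely for illustration.

def degree(poly):
    """Return the largest total exponent among monomials with nonzero coefficient."""
    return max(i + j for (i, j), coeff in poly.items() if coeff != 0)

monomial = {(2, 7): 5}                        # 5x^2y^7
line = {(1, 0): 2, (0, 1): 3, (0, 0): -1}     # 2x + 3y - 1
circle = {(2, 0): 1, (0, 2): 1, (0, 0): -1}   # x^2 + y^2 - 1

print(degree(monomial), degree(line), degree(circle))  # 9 1 2
```

The monomial has degree 9, the line has degree 1, and the circle has degree 2, matching the counts above.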
Bézout’s Theorem says that if we have two curves, of degrees d and e, then they should intersect in d*e points.
This is true in all the cases we saw above. However, it is not difficult to come up with counterexamples. For example, we can have two lines that do not intersect at all (parallel lines): (Figure 5). We can have a circle and a line intersecting at just one point, or perhaps not at all. (Figures 6 and 7) Similar sorts of problems can frequently occur, but if we draw many such examples, we quickly discover something interesting: a degree d curve and a degree e curve intersect in at most d*e points. This suggests that, perhaps, there are extra points somewhere that we need to be counting; they just haven’t shown up yet as examples of points we already understand.
Well, actually even that’s not quite right. Suppose we try to intersect a line with itself. They intersect at every point on that line, which is infinitely many of them.
So far, this theorem isn’t looking terribly promising. We have an array of counterexamples. However, the theorem sounds really good, and it appears to hold “under nice circumstances”; we might have a vague intuitive understanding of when it will and won’t hold from looking at a few pictures. And, of course, mathematicians tend to be stubborn people; when we’ve hit upon a nice result, even one that’s false, we’re going to try really hard to rescue it in any way we can. And it’s even better when the rescue attempts inform new areas of mathematics.
So, with that in mind, let’s try to start patching up our counterexamples. Let’s begin with the easiest one to fix: what if we have something like a parabola and a line not intersecting at all? Let’s take a close look at an example. x^2-y=0 is a parabola, and y=-1 is a line. They don’t intersect at any points (Figure 8), but we want them to intersect twice. Generally, if we want to find the coordinates of the points of intersection, we try to solve the equations of the two curves simultaneously. That is, we want to know which values of x and y satisfy x^2-y=0 and y=-1. Substituting the second equation into the first gives x^2+1=0. So, the problem is that there aren’t any numbers with x^2+1=0. Or are there?
Of course there are. And if you don’t think there are any, then make up new numbers, which I’ll call i and -i, that do satisfy i^2+1=0 and (-i)^2+1=0. These are complex numbers. So, the first correction we make is to insist that we’re looking for solutions in the complex numbers, not just in the real numbers. (Or, for those of you fond of generality, we at least need to work over some algebraically closed field.) Unfortunately, since the complex numbers are 2-dimensional, and we have two complex coordinates, we need to imagine a 4-dimensional space. That’s too hard for me to draw or even visualize. So I’ll just stick to drawings done over the real numbers. Just remember there’s more to the correct picture than is actually drawn.
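Here is a minimal sketch of that computation in Python, using the standard cmath module. The names parabola and points are mine, chosen only for illustration. It checks that the two complex points with x=i and x=-i, both with y=-1, really do lie on the parabola x^2-y=0 and on the line y=-1.

```python
import cmath

def parabola(x, y):
    # Left-hand side of x^2 - y = 0.
    return x**2 - y

# Substituting y = -1 into x^2 - y = 0 gives x^2 + 1 = 0, that is, x^2 = -1.
i = cmath.sqrt(-1)               # the principal square root of -1
points = [(i, -1), (-i, -1)]     # both points have y = -1, so both lie on the line

for x, y in points:
    # Each value should be zero, up to floating-point rounding.
    print((x, y), abs(parabola(x, y)))
```

So over the complex numbers the parabola and the line meet in two points, as we hoped.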
So, that problem wasn’t too tough to fix: it’s just the Fundamental Theorem of Algebra. Let’s look at a more difficult problem: that of parallel lines. Suppose we have two parallel lines, such as x+y=0 and x+y+1=0. (Figure 9) If we try to solve these simultaneously, we get 1=0. That’s just a blatant falsehood, so we need to do better.
As a thought experiment, let’s not consider lines that are actually parallel; let’s just consider lines that are “almost” parallel. (Figure 10) They do intersect, but they intersect very far away. As the lines get closer and closer to being parallel, the intersection moves further away, so we might be inspired to say that the parallel lines actually do meet, but only “at infinity.”
That’s all very well for intuition, but it’s not math yet. Let’s try to fix that. We want at least one extra point, and possibly many, at infinity. Let’s try just having one point at infinity. That means we should have a point very far away which connects all the faraway points on the plane. Imagine taking the plane and bringing the edges up so that they meet at a point. If we then smooth it out a bit, we end up with a sphere. We also want all our lines to pass through the extra point at infinity (which is the north pole of the sphere), so they become some sort of shape (which turns out to be a circle, at least if we do the smoothing correctly) passing through the north pole.
Let’s check if we might be on the right track. If we start with two lines which were parallel on the plane, they now intersect only at the north pole (Figure 11). So far, so good. But, if we start with two lines that weren’t parallel to begin with, then they would have one point of intersection in the plane, and they’d intersect again at the north pole on the sphere. So they’d intersect twice. That doesn’t seem right, and it doesn’t match the conclusion of Bézout’s Theorem. So, this idea of having one extra point at infinity is not quite right yet.
However, if we step back a little, or maybe a lot, we can still find inspiration in the sphere. Let’s look at only the great circles on the sphere. Those are the ones with maximal radius, those whose center is the center of the sphere. Two of these great circles always intersect in exactly two points (unless they’re the same circle). Furthermore, the two points in which they intersect are diametrically opposite (or antipodal). (Figure 12) A good stroke of insight may lead us to think of a pair of antipodal points as being a single point and a great circle as being a line. Under these strange definitions, two distinct lines always meet at exactly one point. This is very promising!
Is it possible to create some sort of a space that only has one point for every antipodal pair of points on the sphere? Sort of. I mean, yes, but you can’t put it into 3-dimensional space in a satisfactory manner. You can, however, put it into 4-dimensional space. It’s called a projective plane. Here’s a way you can almost imagine it in 3-dimensional space. Cut a sphere in half, and consider only one hemisphere. For most pairs of antipodal points on the original sphere, there’s now only one point on the hemisphere. But there’s a boundary circle of the hemisphere, and for every antipodal pair of points on the sphere lying on that boundary circle, we still have two points. So, we have to glue them together. Sadly, it must pass through itself if you want to glue them together in 3-dimensional space, but that’s not necessary in 4-dimensional space. (Figure 13) For various reasons related to many areas of mathematics, I like to think of this space as being a plane plus a line plus a point.
Now, we might have a rough idea for how to picture a projective plane and lines on it, but we don’t yet know how to picture other curves on it, and we certainly don’t know how to write or solve equations on it. And that’s because we’re not thinking of the projective plane in quite the right way yet. The right way of thinking about it is to consider the projective plane as lines through the origin in three-dimensional space. (Where does this come from? The opposite points on the sphere are exactly those that lie on the same line through the origin. Now we’re looking at the whole line rather than just the two points, because it will turn out to be much easier to write down equations that way.) So, we don’t consider every point on these lines; we just consider each of the lines once. But, since we tend to find it easier to solve for points than to solve for lines, we’ll work with the points on these lines. Hence, we’ll represent a point in the projective plane by a triple (x,y,z), where not all of x, y, and z are zero. (Why shouldn’t they all be zero? If they are, then this point is on all the lines, so we don’t know which line to choose. All the other points are on exactly one line, so there can be no such difficulties.) Now, since our points are defined by coordinates x, y, and z, our equations for lines and curves should also be in terms of x, y, and z. Before, they were only in x and y. So, we need to figure out how to put z’s into the equations. But we also need to remember that our points only represent lines, so, for example, (x,y,z) and (2x,2y,2z) are the same point. For a random function f(x,y,z) of x, y, and z, we won’t generally have f(x,y,z)=0 and f(2x,2y,2z)=0 for exactly the same values of (x,y,z). Are we in trouble?
No! We just have to restrict our attention to a certain class of functions that do have this property. Suppose every monomial in f(x,y,z) has the same degree, say r. Then, if a is any (nonzero) number, f(ax,ay,az)=a^r*f(x,y,z). Therefore, f(ax,ay,az) and f(x,y,z) are 0 at exactly the same points. These are the functions we wish to consider. They are called homogeneous polynomials, of degree r.
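As a sanity check on the scaling rule, here is a short Python sketch. The two polynomials f and g are examples I made up for illustration: f is homogeneous of degree 3, while g is not homogeneous. Exact fractions are used so that no rounding gets in the way.

```python
from fractions import Fraction

def f(x, y, z):
    # x^2*y + 2*y*z^2 - z^3: every monomial has degree 3, so f is homogeneous.
    return x**2 * y + 2 * y * z**2 - z**3

def g(x, y, z):
    # x + y + 1: the monomials have degrees 1, 1 and 0, so g is not homogeneous.
    return x + y + 1

r = 3
point = (Fraction(3), Fraction(-1), Fraction(2))
for a in (Fraction(2), Fraction(-5), Fraction(1, 3)):
    scaled = tuple(a * c for c in point)
    # Scaling the point multiplies the value of f by a^r.
    print(a, f(*scaled) == a**r * f(*point))

# For g, one representative of a projective point can be a zero while another is not.
print(g(1, -2, 0), g(2, -4, 0))  # 0 -1
```

The last line shows the trouble with non-homogeneous functions: (1,-2,0) and (2,-4,0) are the same projective point, but g vanishes at only one of them.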
The original equations we considered weren’t homogeneous though; we were considering equations like ax+by+c=0 for a line; here the first two monomials have degree 1, but the last one has degree 0. But remember that we wanted to get the z’s into our equations somehow. We can solve both of these problems with one clever stroke: put as many z’s in each monomial as is necessary to make the polynomial homogeneous of minimal degree. In the case of a line, ax+by+c=0, the degree of the entire polynomial is 1, and the degrees of the individual monomials are 1, 1, and 0, respectively. So, we throw in an extra z in the “c” term to make them all have degree 1. This gives us ax+by+cz=0. This is the projective version of a line. Similarly, if we have a parabola, say x^2-y=0, we convert this to x^2-yz=0. This is a projective parabola.
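The homogenizing recipe is mechanical enough to sketch in Python. The dictionary representation of a polynomial (exponent tuples mapped to coefficients) and the function name homogenize are assumptions of mine for illustration only.

```python
def homogenize(poly):
    """Turn {(i, j): coeff} in x, y into {(i, j, k): coeff} in x, y, z.

    Each monomial gets just enough factors of z to bring its degree up to
    the degree d of the whole polynomial.
    """
    d = max(i + j for (i, j), c in poly.items() if c != 0)
    return {(i, j, d - i - j): c for (i, j), c in poly.items() if c != 0}

line = {(1, 0): 1, (0, 1): 1, (0, 0): 1}   # x + y + 1
parabola = {(2, 0): 1, (0, 1): -1}         # x^2 - y

print(homogenize(line))      # x + y + z
print(homogenize(parabola))  # x^2 - yz
```

In the output, the constant term of the line picks up one z, and the y term of the parabola picks up one z, just as in the hand computation.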
Now we can solve equations almost as usual. So, let’s try to find the intersection of the “parallel” lines x+y=0 and x+y+1=0. First, we have to homogenize them, so the equations become x+y=0 and x+y+z=0. These equations tell us that z=0 and y=-x, so all solutions are of the form (x,-x,0). These points all lie on one line, so this means that there’s exactly one intersection point of the projectivizations of these two lines. Whew!
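For two projective lines, there is a standard shortcut that reproduces this answer: if the lines are ax+by+cz=0 and a'x+b'y+c'z=0, then the cross product of the coefficient vectors (a,b,c) and (a',b',c') is a point lying on both. The sketch below is not how we solved the equations above, just a quick check of the result; the names cross, line1, line2, and line3 are mine.

```python
def cross(u, v):
    """Cross product of two vectors in 3-dimensional space."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

line1 = (1, 1, 0)    # x + y = 0
line2 = (1, 1, 1)    # x + y + z = 0, the homogenization of x + y + 1 = 0
line3 = (1, -1, -2)  # x - y - 2z = 0, a line not parallel to line1

print(cross(line1, line2))  # (1, -1, 0): a point at infinity, since z = 0
print(cross(line1, line3))  # (-2, 2, -2): the same point as (1, -1, 1)
```

The first answer is of the form (x,-x,0), as computed above. The second has nonzero z, so after rescaling to z=1 it is the ordinary point (1,-1) of the plane.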
We pointed out two more types of things that can go wrong. The first is tangency: the parabola x^2-y=0 and the line y=0 only meet at one point, (0,0), but we expect there to be two points. (Figure 14) Projectivizing doesn’t help; we get x^2-yz=0 and y=0, so the only point of intersection is the line (0,0,z).
So, we’re back to the drawing board. Let’s try to solve the equations x^2-y=0 and y=0 really carefully. The second equation gives us the y-coordinate for free: it has to be 0. So, let’s substitute y=0 into the first equation. We get x^2=0, so x=0. But wait a minute; I said I wanted to solve these equations really carefully. Why should x be 0? I only know that x^2 is 0. Our previous experience tells us that 0 is the only number whose square is zero. But we’ve been ignoring our previous experience on other occasions when trying to make Bézout’s Theorem work out. And we’ll have to do that here too: we can have numbers other than zero whose squares are zero.
Let’s look at a few more examples. How many times should the curve x^3+2x-y+5=0 and the line 14x-y-11=0 intersect? (Figure 15) They cross at (-4,-67), and they touch at (2,17), where the line is tangent to the curve. When we solve the equations simultaneously, by solving for y in the second equation to get y=14x-11 and substituting it into the first, we get x^3-12x+16=0, which factors as (x-2)^2(x+4)=0. Hence, we either have x+4=0 or (x-2)^2=0. In the first case, x=-4, of course, but we want the second one to give us two solutions.
How about the curve x^3-y=0 and the line y=0? (Figure 16) Substituting the second equation into the first gives x^3=0. We want this to count as three solutions.
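In examples like these, where substituting the line into the curve leaves a polynomial in x alone, one plausible way to do the counting is with formal derivatives: count how many of p, p', p'', and so on vanish at the root before one fails to. The Python sketch below does this. It is only a sketch for this one-variable situation, not a definition of intersection multiplicity in general, and the function names are my own.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial given as [c0, c1, c2, ...], where ck multiplies x^k."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def derivative(coeffs):
    """Formal derivative: the coefficient of x^(k-1) is k times ck."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def multiplicity(coeffs, root):
    """Count how many of p, p', p'', ... vanish at root before one does not."""
    m = 0
    while coeffs and evaluate(coeffs, root) == 0:
        m += 1
        coeffs = derivative(coeffs)
    return m

cubic = [16, -12, 0, 1]   # x^3 - 12x + 16, which factors as (x-2)^2(x+4)
print(multiplicity(cubic, 2), multiplicity(cubic, -4))  # 2 1
print(multiplicity([0, 0, 0, 1], 0))                    # 3, for x^3 = 0
```

The counts add up to 3 in both examples, which is the product of the degrees of a cubic curve and a line.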
The correct definition for this so-called intersection multiplicity is fairly complicated and involves a fair amount of linear algebra and ring theory, so in the interest of keeping this exposition as elementary as possible, I won’t give it here. Essentially, we’re saying that something like x^2=0 is just a little bit weaker than saying x=0, and saying x^3=0 is a little weaker still than saying x^2=0. That’s why we might get other weird-looking solutions. (Exercise for those of you who already know algebraic geometry: without mentioning ideals, nilpotents, dimensions, vector spaces, or the other usual ingredients that go into the intersection multiplicity, how do we “define” it, or at least give a really convincing plausibility statement about it? Calculus/formal differentiation is okay. I haven’t convinced myself that this can be done satisfactorily yet.)
Perhaps we’re now fairly happy with the notion of intersection multiplicity. That leaves one final problem. What do we do if the two curves have a piece in common? We mentioned earlier the case of the intersection of a line with itself.
Unfortunately, in the context of Bézout’s Theorem, there’s not so much we can do. We just disallow this case. There is a way of dealing with this situation, but it’s more complicated and involves a very important object called the Chow ring.
Let’s now state the theorem in a reasonably correct form:
Bézout’s Theorem: Let C and C’ be two curves in the complex projective plane, of homogeneous degree d and e, respectively, that have no components in common. If P is a point of intersection of C and C’, let I(P) be the intersection multiplicity of C and C’ at P. Then the sum of the I(P), taken over all the points of intersection of C and C’, is equal to d*e.
I won’t prove Bézout’s Theorem, mostly because I haven’t really managed to define the intersection multiplicity in a satisfactory manner. Its proof can be found in many books on algebraic geometry.
What we have done here is to start with a result we liked, even though it wasn’t true, and change our definitions almost beyond recognition in order to make it true. This process led us naturally to many really good ideas in mathematics: the complex numbers, projective space, possibly aspects of calculus and ring theory. If we were to continue on, trying to remove the condition about having no components in common, we would be led to the notion of a divisor, which is another central topic in modern mathematics.
That’s the way mathematics is done in real life.