Defeating an IM

I hadn’t played a tournament since May, when I had a disastrous tournament and lost four games before withdrawing in disgust. My tournament this weekend was mostly mediocre, except for one game, in which I scored my first win against an IM. (It was a useful moment to start playing like a master myself.) Naturally, that’s the one I’m going to show.

Simon Rubinstein-Salzedo (2060) vs. IM Ray Kaufman (2354)

1. e4 e5 2. Nf3 Nc6 3. d4 exd4 4. Nxd4 Nf6 5. Nxc6 bxc6 6. Bd3

Previously, I had played the main line of the super-sharp Mieses variation of the Scotch, which goes 6. e5 Qe7! (not Nd5) 7. Qe2 Nd5 8. c4 and black plays either 8…Nb6 or 8…Ba6. The game could continue something like this. However, this is the sort of game I’d rather watch than play, being a Boring Positional Player and all that.

So, the quieter and more civilised 6. Bd3 line suits me better and allows me not to memorize gobs of theory, although objectively it offers white somewhat fewer prospects.


This move surprised me. I usually see 6…d5, after which I would continue with 7. exd5, since after 7. e5 Ng4 8. 0-0 Bc5, black is going to be the one who has more fun for a while. Here’s the sort of thing that can go wrong for white. But after 7. exd5, white is slightly better, since eir pawn structure is slightly better.

7. O-O Be7 8. Qe2

Okay, 8. Nc3 is more natural (and more common), but I am not yet sure how I want to develop my queenside pieces: do I put my bishop on g5, or on b2? Do I develop the knight to c3 or d2? If c3, should I play c4 first? But I am certain that I will have to move my queen out at some point, so I might as well do that now and think more about what to do about the rest of my pieces.

8…O-O 9. b3 Nd7!

A very sensible redevelopment. My opponent is planning to play Bf6 next move (whether I play Bb2 or not) and later plant the knight on c5 or e5, where it’s sort of annoying.

10. Bb2 Bf6 11. Bxf6 Qxf6 12. Nd2 Re8 13. f4 Nc5 14. Rae1 Qc3


I think both sides can be relatively satisfied with this position. It’s about equal, and there isn’t anything obvious that either side has to do, so we can just play chess and try to come up with the best ideas we can.

15. Nf3 a5

Okay, so his plan is to weaken my queenside pawns or else open the a-file for his rook. Very sensible. Playing 16. a4 felt wrong, since after 16…Nxd3 17. Qxd3 Qxd3 18. cxd3 Rb8, I’ve given my opponent a target for no good reason. One possibility (depending on where the queens are at the time) is to play b4 after black plays a4 and then follow up with a3 (unless black plays it first). Then we each have a bunch of structural weaknesses that probably balance out. Another option (which the computer recommends) is to play e5 and focus on black’s c-file weaknesses. The position remains balanced.

16. Qd2 Qb4

I’d be satisfied with exchanging queens on d2, but not on b4.

17. c3 Qb6

Not 17…Qa3? 18. Bb1!, and after an eventual e5, the white bishop will be a very dangerous piece.

18. Rf2

But not 18. Kh1? when after 18…a4! I don’t have a satisfactory way of dealing with Nxd3 and Ba6.



Really? That way? I was mostly expecting the bishop to move out to a6 or b7 at some point.

19. Nd4

Aside from the fact that I don’t want to allow him to capture on f3 and slightly damage my kingside pawns (which wouldn’t be too big of a problem), this introduces some interesting tactical possibilities in the next few moves.

19…Nxd3 20. Qxd3 a4

It looks like black is making progress on the queenside. But there’s a small problem.

21. f5!


Suddenly the bishop is in serious danger of getting trapped. The only way out is to play f6 followed by Bh5 and Bf7. Note that the move order is very important: After 21…Bh5? 22. Qh3! g6 23. g4 wins the bishop.

21… axb3!

Unbelievable! He’s just going to let me trap his bishop? Yes, because if I start to trap it with 22. h3, then after 22…bxa2 23. Ra1 c5 24. Nc2 Qb1 25. Rf1 c4 26. Qd2 Qb6 27. Kh1 Bh5 28. g4 Bxg4 29. hxg4, black has lots of compensation for the piece and is better (but possibly not winning). Of course, I didn’t see the whole line, but I saw enough to realize that it was sensible to stabilize the queenside before dealing with the bishop.

22. axb3! c5?

There’s no need to help me reroute my knight to a better square. Now white is substantially better.

Just a few moves ago, we had a relatively quiet position, where it seemed hard for either side to make any seriously dangerous threats. Now the position is very sharp, and seemingly minor inaccuracies like c5 can be fatal.

23. Nc2!

23. Nb5 was briefly tempting, but my move helps the knight get to the square it actually wants to reach: d5.



Of course, I was expecting this move, but it runs into tactical problems…that I completely overlooked. Better was to sacrifice the bishop with 23…Qxb3, which gives black some compensation for the piece, but probably not enough.

24. Ne3?

Looks sensible, but after 24. Qc4+! black is in serious trouble and possibly just lost. The obvious move 24…Kh8? just loses to 25. Ne3!, when the bishop is lost for very little compensation. Better is 24…Kf8, but after 25. e5!, black’s problems continue: 25…Bh5? loses outright to 26. e6, but after 25…Rxe5 26. Rxe5 dxe5 27. Qxg4 Qxb3, black can muddle on. Instead, the computer’s choice is 24…d5, but white just captures with the queen and is up a pawn for no compensation.

After my mistake, black is okay again. At least, in theory: in practice, white’s position is much easier to play.

24…Bh5 25. Nd5 Qb7 26. Qg3!


No more messing around; it’s time to attack the king.


I was expecting this move, but 26…Kf8 avoids some of black’s later problems. I suppose he missed or underestimated my 28th move.

27. Qh4 Bf7?

What else is he supposed to do? 27…c6! Then after 28. Nxf6 gxf6 29. Qxh5 Qxb3, the position is equal. Now white is winning.

28. Nxf6!


This is just winning. I wasn’t so confident during the game, but I couldn’t find a satisfactory defense for black, so I had to play it. But on the other hand, what am I doing!?!?! I’m supposed to be a Boring Positional Player; why am I sacrificing my pieces? (And why does this happen in quite a few of the games I have annotated on my blog recently?)

28…gxf6 29. Qxf6+ Kg8 30. Rf4

The computer tells me that 30. Rf3 is even stronger. Whatever.


30…Qxb3 puts up a bit more resistance, but it’s still lost.

31. Qh6

New mate threat: f6 and Qg7.

31…c6 32. Rf3 h4 33. Rf4 Black resigns


Of course, I’m thrilled with the result of this game, but I’m also quite satisfies with the game itself. It was one of the most interesting games I’ve played, and I’m happy with how I played, even if I did miss a very powerful move on move 24.

I’m hoping this is only the first of many wins over masters on my quest of chess improvement.

Posted in Uncategorized | 2 Comments

Why didn’t Euler conjecture the prime number theorem?

It shouldn’t have been too hard, based on what he knew. Let us take a look.

Euler knew that the sum of reciprocal primes diverged, and he even knew the growth rate of \sum_{p\le x} \frac{1}{p}, where the sum is taken only over primes p. His method of discovering this was rather elegant, even if his lack of respect for divergent series may look horrifying to a modern mathematician. It was already known earlier that the harmonic series grows like \log x; that is,

\sum_{1\le n\le x} \frac{1}{n}\approx \log x,

or more precisely,

\lim_{x\to\infty} \left(\sum_{1\le n\le x} \frac{1}{n}-\log x\right)=\gamma\approx 0.577.

Euler had earlier worked out the famous Euler product for the zeta function:

\sum_{n=1}^\infty \frac{1}{n^s} = \prod_p \left(1-\frac{1}{p^s}\right)^{-1}

for \Re(s)>1. (This formula has a delightful interpretation as an analytic formulation of the fundamental theorem of arithmetic.) Since neither side of the above formula makes sense when s=1, Euler did the most logical thing possible and set s=1 anyway, to obtain

1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\cdots = \left(1-\frac{1}{2}\right)^{-1}\left(1-\frac{1}{3}\right)^{-1}\left(1-\frac{1}{5}\right)^{-1}\left(1-\frac{1}{7}\right)^{-1}\cdots.

Taking logarithms, he then obtained


and expanding out all the logarithmic terms on the right as power series and regrouping, the right side becomes

\left(\frac{1}{2}+\frac{1}{3}+\frac{1}{5}+\frac{1}{7}+\cdots\right) + \frac{1}{2}\left(\frac{1}{2^2}+\frac{1}{3^2}+\frac{1}{5^2}+\frac{1}{7^2}+\cdots\right) + \frac{1}{3}\left(\frac{1}{2^3}+\frac{1}{3^3}+\frac{1}{5^3}+\frac{1}{7^3}+\cdots\right)+\cdots.

Since all the terms except the first add up to something finite, they will not make a significant contribution in the limit, so Euler ignored them. (He could also have truncated after the first term when expanding the logarithms as power series, which would have had the same effect.) Hence, in the limit,

\frac{1}{2}+\frac{1}{3}+\frac{1}{5}+\frac{1}{7}+\cdots = \log\left(1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\cdots\right) = \log\log\infty.

Naturally, that should be interpreted as

\sum_{p\le x} \frac{1}{p}\approx \log\log x.

One can get rid of all the divergent series and weird expressions like \log\log\infty by taking limits and being careful with error terms like a modern mathematician would do. But that wasn’t a necessary part of the culture in the 18th century.

At this point, Euler stopped, but I don’t understand why. It is now possible to make an estimate for the number of primes up to x, using a quick and easy computation. Since I’m a rebellious mathematician, I did the following computation on the front of an envelope.

Let us try to estimate the number of primes between x and kx, for some k>1, and let us write \pi(x) for the number of primes up to x. Then

\sum_{x\le p\le kx} \frac{1}{p}\approx \log\log kx - \log\log x\approx \frac{\pi(kx)-\pi(x)}{x},

where the explanation for the last term is that, if k is not much bigger than 1, then each summand on the left is roughly \frac{1}{x}. The goal is now to determine \pi(kx)-\pi(x). We have

\pi(kx)-\pi(x)\approx x(\log\log kx-\log\log x)=x\log\left(1+\frac{\log k}{\log x}\right).

Expanding the logarithm using a power series and truncating after the first term, we obtain

\pi(kx)-\pi(x)\approx \frac{x\log k}{\log x}.

Now, choosing k=1+\frac{1}{x} and again taking the first term of the power series for the logarithm (that seems to be the name of the game today), we obtain

\pi(x+1)-\pi(x)\approx \frac{1}{\log x}.

Now, summing over x gives

\sum_{x=1}^{n-1} (\pi(x+1)-\pi(x))=\pi(n)\approx \sum_{x=2}^{n-1}\frac{1}{\log x}\approx\int_2^n \frac{dt}{\log t}

(where I wrote 2 instead of 1 as the lower bound for the sum and integral to make them converge). And that’s one version of the prime number theorem. Nothing here would have caused Euler to break a sweat. But yet Gauss seems to have been the first one to conjecture the prime number theorem, and even he apparently did so on the basis of actually working out tables of primes and counting.

Posted in Uncategorized | 4 Comments

The Wallis Product

I had seen the Wallis product, but I did not know a derivation of it until a few weeks ago, when I discovered this gem when reading the wonderful book Sources in the Development of Mathematics by Ranjan Roy. I’m storing this example to use next time I teach calculus, as it requires nothing more than integration by parts.

Like in the case of many other such things in mathematics, I don’t know what the Wallis product actually is; I can only discover it by going through a derivation. So, I won’t give away the answer until we get there. But the goal is to write down an infinite product for \pi.

In order to do this, we evaluate the integral I_n = \int_0^\pi \sin^n x\, dx. The first few are easy:

I_0 = \int_0^\pi dx=\pi

I_1 = \int_0^\pi \sin x\, dx = 2.

I can keep working out more of these, but now it’s time to tackle the general case, using integration by parts. We assume n\ge 2, and we compute \int_0^\pi u\, dv, where u=\sin^{n-1} x and v=\sin x\, dx, so that

I_n = \int_0^\pi \sin^n x\, dx = \left[-\sin^{n-1} x\cos x\right]_0^\pi +\int_0^\pi (n-1)\sin^{n-2} x\cos^2 x\, dx = \int_0^\pi (n-1)\sin^{n-2}x(1-\sin^2 x)\, dx = (n-1)(I_{n-2}-I_n).

Solving for I_n, we get I_n = \frac{n-1}{n}I_{n-2}. Now it’s easy to compute more of the I_n‘s:

I_2 = \pi\times\frac{1}{2}

I_3 = 2\times\frac{2}{3}

I_4 = \pi\times\frac{1}{2}\times\frac{3}{4}

I_5 = 2\times\frac{2}{3}\times\frac{4}{5},

and in general

I_{2n} = \pi\times\frac{1}{2}\times\frac{3}{4}\times\cdots\times\frac{2n-1}{2n}

I_{2n+1} = 2\times\frac{2}{3}\times\frac{4}{5}\times\cdots\times\frac{2n}{2n+1}.

Now, note that for any n, I_{n-1}>I_n>I_{n+1}, since \sin^{n-1} x\ge \sin^n x\ge\sin^{n+1} x whenever 0\le x\le\pi. Furthermore, since \frac{I_{n+1}}{I_{n-1}}=\frac{n}{n+1}, this means that eventually I_{2n-1} and I_{2n} get very close together. So

\pi\times\frac{1}{2}\times\frac{3}{4}\times\cdots\times\frac{2n-1}{2n}\approx 2\times\frac{2}{3}\times\frac{4}{5}\times\cdots\times\frac{2n}{2n+1},


\frac{\pi}{4}\approx \frac{2}{3}\times\frac{4}{3}\times\frac{4}{5}\times\frac{6}{5}\times\frac{6}{7}\times\cdots\times\frac{2n}{2n+1}.

(If we wanted, we could be more precise, and give two inequalities for \frac{\pi}{4}, but it doesn’t add much value, in my opinion, to do that here.)

Taking the limit, we get

\frac{\pi}{4}= \frac{2}{3}\times\frac{4}{3}\times\frac{4}{5}\times\frac{6}{5}\times\frac{6}{7}\cdots.

This is the Wallis product.

Posted in Uncategorized | Leave a comment

One interesting thing: Counting skew ternary trees

On Monday, I attended Don Knuth’s annual Christmas Tree Lecture, where he talks about the most interesting thing he has learned about trees in the past year. This year, he talked about ternary trees and planar graphs. Among other things, he showed us a beautiful theorem with a correspondingly beautiful proof, which I shall reproduce here.

A (nonempty) ternary tree is a tree consisting of a root node, and (possibly) several other nodes, so that each node has at most three children (denoted left, middle, and right). We draw ternary trees like this:


In this ternary tree, A is the root node, and it has a left child (B) and a right child (C), but no middle child. B has a left child (D) and a middle child (E), but no right child. C has a left child (F) and a middle child (G), but no right child. D has only a right child (H). Some nodes (E, F, G, and H) have no children.

If we have a ternary tree, we can assign ranks to the vertices, as follows: the root gets rank 0, and if a node has rank n, then a left child (if there is one) will have rank n-1, a middle child will have rank n, and a right child will have rank n+1. Hence, in the above tree, A and F have rank 0; B, E, and H have rank -1; D has rank -2; C and G have rank 1.

We call a ternary tree a skew ternary tree if all the ranks are nonnegative. The question we’ll answer is, what proportion of ternary trees with n nodes are skew?

Whenever we have such a question and, there is only one correct way to start: calculate a few terms and look up the sequence on the Online Encyclopedia of Integer Sequences. Fortunately for me, after going to the talk, I was able to remember enough of the terms of the total number of ternary trees to find it without having to do any calculation: it is sequence A001764, and it begins 1, 1, 3, 12, 55, 273, 1428, 7752. It tells me that there are \frac{1}{2n+1}\binom{3n}{n} of them. This isn’t too surprising, since the famous Catalan numbers \frac{1}{n+1}\binom{2n}{n} enumerate binary trees (see Richard Stanley’s Enumerative Combinatorics Volume 2, Exercise 6.19(c)).

Unfortunately, the skew trees are not represented on OEIS, but since I remember the theorem, I can calculate them easily: the sequence begins 1, 1, 2, 6, 22, 91, 408, 1938. The theorem is as follows:

Theorem. The ratio of the number of skew ternary trees with n nodes to the number of total ternary trees with n nodes is \frac{2}{n+1}.

I think this is a rather remarkable theorem, because I don’t see any a priori reason to expect any nice relationship between these two numbers. On the other hand, perhaps it’s not so different from the classical Catalan situation: there are \binom{2n}{n} ways to get from (0,0) to (n,n) by taking unit steps either up or to the right, and \binom{1}{n+1}\binom{2n}{n} ways of doing this without every going below the diagonal line y=x.

Before moving on to the proof, I want to pose a question to think about while going through the proof: can a variation of this argument be used to write down a formula for the Catalan numbers? I don’t see it, but I’d love to hear it if you do!

In order to prove the theorem, we start by filling in all the potential children, so that every node has three children or phantom-children. In addition, all nodes except the root have parent nodes, so in order to make the root act more like the rest of them, let’s add a phantom-parent for the root. Let us now count the phantom nodes: each node has exactly four neighbors or phantom-neighbors, so there are a total of 4n edge-germs emanating from all the genuine nodes, put together. There a total of n-1 edges between genuine nodes, and each one counts as two edge-germs, so there are a total of 4n-(2n-2)=2n+2 phantom edge-germs, or $2n+2$ phantom nodes.

Now, we start from the root and trace the outline of the entire tree. Since we are civilised mathematicians, of course we go counterclockwise. As we go, we write down a list of numbers, adding a new one every time we hit a genuine node. It doesn’t matter what number we start with, but Knuth chose -2, so I’ll follow his lead. Now, as we trace, we hit genuine nodes at various times. When we do, we either increase or decrease our running total by one. If we’ve just gone around a phantom node, then we increase the total by one, whereas if we have just come from a (different) genuine node, we decrease it by one.

Let’s see this in action with the example above. After we add the phantom nodes, we get the following diagram:


We start just before the phantom parent-node to node A with -2. Then we move around the phantom parent node. This increases the total by 1, to -1. Continuing along the outline, we next reach node B, which decreases our count to -2. Then we go on to node D, which decreases our count to -3, and so on. Here is a table with the first few values.

-2 -1 -2 -3 -2 -1 -2 -1 0 1

We can also represent this table in pictorial form, just by keeping track of where we are in the running total after each step. Here is the picture for the above graph.


It’s a bit easier to keep track of what’s going on if we overlay this picture on five horizontal lines (also known as a musical staff), so we can see immediately where we are:


Now it’s easy to see that we start at -2 and end at +2 in this case. But in fact, that’s always true: we pass 2n+2 phantom nodes and 2n-2 edges between genuine nodes. The former decrease the count by one, and the latter increase it by 1, so we have a net +4 by tracing the entire tree. So, we always start at -2 and end at +2.

The next observation is that the root of the tree shouldn’t be thought of as being fixed in place. Instead, we can pick a new node to be the root and let everything dangle from there. But we should be a little bit more precise on this point: it’s not quite the root node that we want to choose, but the parent phantom node of the root. Once we have done that, if we keep all the other edges in the right order, we obtain another ternary tree. We can do this with each phantom node, and so we obtain a total of 2n+2 different ternary trees in this way.

The next thing to work out is how the list of running totals changes when we choose a new parent phantom node. But this is not so hard: we see the same increases and decreases starting from some point, but then it must wrap around from the beginning at some point. For example, if we pick the new parent phantom node to be the right phantom child of F, we obtain the following chart (marked with the red rectangle):


So far, we haven’t said anything about what distinguishes skew trees from arbitrary ternary trees, but there is a very nice interpretation of them in terms of the counts: they are exactly the trees that never return to -2. With that observation in mind, the proof of the theorem is now quite easy: since each trace around the outline of the tree increases the count by 4, if we trace the outline of the tree many times, there will be several places where we encounter certain counts for the last time. These will certainly be before phantom nodes, rather than genuine nodes, since genuine nodes decrease the count. Those phantom nodes are exactly the phantom nodes that lead to skew ternary trees if we hang the tree from them. Since the count increases by 4 when we trace the outline once, there will be four such phantom nodes in each trace, out of a total of 2n+2. Hence, the proportion of skew trees among any one such family is \frac{4}{2n+2}=\frac{2}{n+1}, and so the proportion of all ternary trees which are skew is also \frac{2}{n+1}. \blacksquare


So, that’s the proof. Can this method be used to come up with a formula for the Catalan numbers as well? It seems as though this should be possible: in a path from (0,0) to (n,n) with right and up steps, there are a total of 2n edges and 2n+1 vertices. We can add in the phantom neighbors to obtain 4(2n+1)-2(2n)=4n+4 phantom nodes. The goal would be to show that it is possible to hang the picture up from each one of the phantom nodes, so that each one leads to a suitable lattice lath, and of those, exactly 4 of them don’t go below the diagonal. But I don’t know how to do this, so I’d be interested to hear suggestions.

Posted in Uncategorized | Leave a comment

Exchange sacrifices galore

Last weekend, I played the BayAreaChess Thanksgiving tournament. My play was pretty bad, but my results (2 wins, 1 draw, 2 losses against all players rated a bit higher than I) were okay, and I hit a new rating high of 2050 after the tournament. This game, the most exciting but definitely not the best-played, features a total of three exchange sacrifices, followed by a sharp endgame culminating in a pawn race. All in all, it was an amusing if inaccurate game.

Simon Rubinstein-Salzedo (2037) vs. Julian Lin (2106)

1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 a6 6. Be3 e5 7. Nb3 Be6 8. f3
Be7 9. Qd2 O-O 10. O-O-O Qc7 11. Kb1

Unfortunately, I don’t know my lines in the Najdorf as well as I should. The main move is 11. g4, but I was under the impression that white is supposed to wait until black plays Nbd7 to play g4. (At least, that was my rough heuristic.) But my move is okay as well.

11…Nbd7 12. Qf2

But I still didn’t play g4 immediately, because I wanted to stop black from playing Nb6 first.

12…Rfc8 13. g4 Qd8

Black is now planning to follow up with Rxc3. So, at this point I had to decide how to evaluate the position after the exchange sacrifice. I thought that if I could isolate some of black’s pieces on the kingside and out of trouble, then black would fail to get full compensation. So, I was willing to encourage the sacrifice.

14. g5 Nh5 15. Rg1



Exchange sacrifice #1. I was pretty optimistic here, because the knight on h5 isn’t actively helping in the attack for the time being, and the e7 bishop also isn’t currently terribly useful.

16. bxc3 Qc7 17. Qd2

An interesting idea is to clarify matters on the kingside and give up the c-pawn with 17. Qh4. My idea was to give up the dark-squared bishop when the knight goes to b6 (since allowing it to get to a4 is unpleasant) and not worry too much when the other knight goes to f4.

17…Nb6 18. Bxb6

Unfortunately, the e3 bishop was doing double duty: stopping one knight from getting to b6, and the other from getting to f4. But I thought that allowing the knight to get to a4 would lead to very bad things for me.

18…Qxb6 19. Bd3 a5 20. Ka1 a4 21. Nc1 Qc5


This was a crafty move, and it took me a little while to understand what was going on: black threatens to play 22…Qa3.

22. Ne2

The idea is to meet 22…Qa3 with 23. c4 b5 24. Qc3


Another nice move, bringing the bishop into the game.

23. Rb1?

After 23. c4, black is a bit better, but white is in better shape than in the game.



24. Rb4

Exchange sacrifice #2. I was originally planning to play 24. Rb5, but after 24…Qa3, white is simply lost. For example, 25. Rb2 Nf4, and c3 falls. So, this exchange sacrifice was forced.

24…Bxb4 25. cxb4 Qf2 26. Qe1 Qxf3

That was the best I could do: go down a pawn, but my structure is somewhat improved, and at least for the time being, black’s attack has been slowed. Objectively, though, I think my position is lost. So, the goal is to do the best I can setting up a fortress and not making any new weaknesses.

27. h4 Qe3 28. Qc1 Qb6

I was mildly surprised by this decision. Okay, black has some attacking chances still, but I expected that my opponent would prefer an endgame with a clear extra pawn. And, to be honest, I was hoping that he would trade queens, given that I have worked quite hard on endgames recently and thought I might hold, even if I didn’t really deserve to.

29. a3 Rc8 30. Rf1 g6 31. Qd2 Bg4!


Good idea: my bishop is relatively useless, being blocked in by my own pawns. (Capablanca’s rule: when you have a bishop, you should put your pawns on the opposite colored squares.)

32. Rd1 Bxe2 33. Bxe2 Nf4? 

This is what I expected, and it’s the most logical move, but 33…Ng3 simply wins another pawn. After this inaccuracy, I can put up some serious resistance.

34. Bg4! Rd8 35. c4

I had been counting on the strength of this pawn structure for some time. It’s no longer easy for black to break through, since d5 never works, b5 probably doesn’t work, a4 might be weak, and white might be able to play c5 in certain circumstances. At this point, I thought I had reasonable chances to hold.

35…Qc6 36. Qc2 Ra8 37. Rc1 b6

This attempt to get the rook back to the c-file was logical enough, but it’s not clear what the point is: I just keep my queen and rook on the c-file and shuffle my king, and how is black going to do anything?

38. Kb1 Kg7 39. Ka1 Ra7 40. Kb2 Ra8

So, he figured out that that idea wasn’t going to work. But what is happening next?

41. Ka1 Rh8


Yeah, that seems like a better plan. For some time, I had been thinking about playing h5 gxh5 Bf5, but now is my last chance to do it.

42. h5 gxh5 43. Bf5 h6 44. gxh6+ Kxh6? 45. Qh2?

A better move is 45. Rg2, trapping the king on the h-file. The idea is that, while there are several things black wants to do, it’s hard to do all of them: if the queen goes to a8 to prepare Rg8 and a rook trade, then I have Qd2, hitting the d-pawn. If the queen stays on c6, then I can play Qh2 and Qg3 (sacrificing the c-pawn), setting up threats on the g-file.

45…Rg8 46. Qh4 Rg5 47. Qh2 Rg2 48. Qh4 Rg5 49. Qh2


I’m not going to complain about a draw here. But he won’t have any of that.


Exchange sacrifice #3. This is not the best way to continue though: 49…Rg2 50. Qh4 Kg7 51. Rc3 Ne2 is completely winning for black. His move might be winning too, but it gives me more chances. One point to notice is that I’m not seriously threatening anything: he could just take some time to put his king in a safer spot before continuing, and I wouldn’t be able to do anything in the meantime.

50. exf5 Qf3 51. Qa2 Qe3 52. Rd1 Qc3+?

It’s better to win the f-pawn first with 52…Kg5 and then push the h-pawn. At the very least, it buys time for a pawn race, which, as we shall see, is very important later in the game.

53. Kb1 Nd3?


It’s probably still a draw with best play, but now white has most of the winning chances.

54. Qd2+ Qxd2 55. Rxd2 e4 56. Re2 Kg5 57. Rxe4 Kxf5 58. Rd4 Ne5

I didn’t know what was going on here during the game, but I knew what I had to do: activate my queenside pawns before black’s kingside pawns get too dangerous. First plan of action: defend my c-pawn with my king so that I can win the d- and b-pawns, and try to do so quickly enough that I can still stop the black pawns on the kingside. But will it work?

59. Kc2 Kg5 60. Kc3


So far, we have stayed in drawing territory. But now black has a decision to make: which pawn to push? Is it better to have two relatively advanced pawns, or one very advanced pawn?


It’s sometimes hard to figure out what to do when you have two passed pawns, but here’s a pretty good heuristic: bet everything on the one that’s the most advanced. This heuristic would lead to the correct move in the case: 60…h4, which draws. For example 61. Rxd6 h3 62. Rd1 Kf4 63. Kd4 h2 64. Kd5 Ng4 65. Rh1 Ne3+ 66. Kc6 Kg3 67. c5 bxc5 68. b5 Nc4 69. Kxc5 Ne5, and black can stop the b-pawn with the knight, so the game should end in a draw.

61. Rxd6 h4 62. Kd4?

In turn, I’m not sure what I missed, but 62. Rxb6 h3 63. Rd6 h2 64. Rd1 is pretty clearly winning for white. I guess the pressure of a long game was taking a toll on both of us by this point.


Better is 62…Nf3+!, because there are too many forks lurking.

63. c5!?

This is probably winning, and it’s very exciting to go into a pawn race and get some new queens on the board, but 63. Rxb6 h3 64. Re6 is a much easier and less stressful way to win.

63…h3 64. cxb6

Not 64. c6?? h2 65. c7 h1Q 66. c8Q Qd1+ 67. Kc4 Qc1+, winning the queen.

64…h2 65. b7 h1=Q 66. b8=Q


I evaluated this as probably winning, but I’m not sure, and anyway with limited time remaining, I didn’t know what would happen. After the expected 66…Qc1, it’s still a tough game. Fortunately for me, though, he quickly went astray.


Maybe this isn’t a horrible move objectively, but it puts the queen in a vulnerable spot, and that’s not a smart thing to do in a complicated position after 5 hours with little time remaining.

67. Kc5 Ne5??

But this loses instantly.

68. Qd8+ Kg4 69. Rd4 1-0

My opponent deserves a lot of credit for extracting a lot from an exchange sacrifice that I didn’t initially believe in (Rxc3). I knew that in the Dragon, Rxc3 always works, but I was more skeptical about its strength in the Najdorf. I guess now I know better to allow it in the future. I also need to know my lines in the Najdorf better if I’m going to continue to play open Sicilians: it’s simply irresponsible to enter into such sharp lines without being well-prepared. (In fact, all three of my white games in this tournament featured the Najdorf. It was only in the third game that I responded relatively well.) Probably I would see the most benefit in my chess ability if I were to work on my openings and learn some systems properly. But how do I do that in the most effective way possible? I could memorize a bunch of lines blindly à la Moonwalking with Einstein using memory palaces or something similar, but I have fundamental objections to using such techniques, since they cause people to remember things for all the same reasons. Anyway, I’m constantly telling my students not to memorize things. Alternatively, I could play through a bunch of games and hope that the lines stick, but usually they don’t, as there is so much to learn. So, how do you learn openings?

Posted in Uncategorized | Leave a comment

One interesting thing: Average polynomial time algorithms

The most celebrated problem in theoretical computer science is undoubtedly the P vs. NP problem. Let me explain what it is asking. (If you already know this, feel free to skip the next few paragraphs.) Suppose we have a general type of problem that we wish to solve. (Think of something like determining if a number is prime, or factoring it.) There are many instances of such a problem. (For example, there are lots of numbers out there, and we could ask for each number whether or not it is prime.) Now, suppose we have a computer program that solves this type of problem: it runs for a while, and eventually tells us the answer. The amount of time it takes to run depends on the complexity of the specific problem we give it: for example, we should expect it to take longer to determine whether a very large number is prime than it would take for a small number, although there are some exceptions. (It would be very fast to say that a large number is composite if it is even, for instance.)

For the most part, then, the amount of time it takes to run the program depends on how complicated the particular instance of the problem is. We’ll measure that in terms of the number of characters needed to enter it. So, we can ask how long it takes to determine if an n-digit number is prime, for example. (Answer: not very long, surprisingly.)

Normally, we aren’t particularly interested in exactly how long it takes to run in terms of number of seconds, because that changes from one computer to another. Instead, we’re interested in how the length of time grows as the length of the input grows. We say that a program runs in polynomial time if there is some polynomial P(n) so that the amount of time to solve any instance of the problem with an input of length n is at most P(n). Observe that this notion does not depend on the speed of the computer; for instance, if we switch to a computer that runs half as fast, that’s equivalent to multiplying P by 2. We call a problem a P problem if it has an algorithm that solves it in polynomial time.

Now, let us consider the notion of NP, which stands for nondeterministic polynomial time. What this means is that, if I magically tell you the answer to a problem (but I might be lying to you), you can check whether I’m telling you the truth in polynomial time. A classic example of this is factoring. If I ask you to factor 51445627, you are unlikely to be able to do that quickly (although there are some very clever algorithms for doing it). But if I tell you that the prime factors are 5689 and 9043, then you can check this quickly, in polynomial time in the length of the original number. Note that all P problems are also NP problems: if you can solve the problem on your own in polynomial time, you can still solve it in polynomial time if I tell you the answer, for example by ignoring my answer.

The P vs. NP problem asks if P = NP. That is, can every problem that can be checked in polynomial time also be solved in polynomial time? Most people believe the answer to be no, but we have made little progress toward a proof.

Note that the property of being in P depends on all instances of the problem. However, computer scientists are a bit different from mathematicians: they don’t usually need to prove things; they just want to get right answers. (Probably not all of them, but that’s my view of it from a mathematician’s world.) So, might it be that there problems that can’t always be solved in polynomial time, but usually can?

This is the subject of Russell Impagliazzo’s very nice survey paper “A Personal View of Average-Case Complexity.” He rather amusingly frames the problem in terms of the famous story about a very young Gauss being asked to sum the numbers from 1 to 100, and the teacher’s subsequent attempts to humiliate Gauss in revenge for his quick solution. (Note: I don’t do that to my students. In fact, I like it when they are clever.)

In it, he includes a definition for an average polynomial time algorithm, which essentially means that if we were to choose a random instance of the problem, we would expect to be able to solve it in polynomial time. But before we can make that more precise, it is necessary to explain what we mean by a random instance of the problem. In order to do that, it makes sense at first to restrict to a given input size n. In order to select a random instance of a problem of size n, we need to put a probability distribution \mu_n on the instances of the problem of length n. Then, intuitively, to say that an algorithm is an average polynomial time algorithm is to say that there is a polynomial P(n) so that the algorithm solves nearly all instances of the problem of length n, chosen according to \mu_n, in time at most P(n), and the “exceptional” cases get increasingly rare as n gets larger. (The real definition is more precise, but this captures the essence of it.)

Now, one might ask whether there are any problems for which there exist average polynomial time algorithms but not polynomial time algorithms. Since we don’t know whether P = NP or not, it’s hard to be sure, but one problem that is very strongly believed not to be a P problem is the Hamiltonian path problem. Let us explain what it says.

A path in a graph is a sequence of edges so that each edge (other than the first) starts where the previous one ended. (So that we can imagine walking on it if we were on the graph.) A path is called a Hamiltonian path if it goes through each vertex once, without any repetitions.

The Hamiltonian path problem asks if, given a graph, we can determine whether there is a Hamiltonian path. This problem is definitely an NP problem: if someone tells us that there is a Hamiltonian path and tells us what it is, then we can quickly check whether it actually is. (However, if someone tells us that there is no Hamiltonian path, it is not clear how we would verify that quickly, and indeed we expect not to be able to do so.)

It is widely believed that the Hamiltonian path problem is not a P problem; in fact, if it were, then we would also have P = NP, since it is possible to convert any NP problem quickly into a Hamiltonian path problem.

So, we believe that we can’t solve the Hamiltonian path problem quickly in general. But what about for most cases?

In order for that question to make sense, we need to put a probability distribution on graphs with n vertices, so that we can sample them. One way of doing that is with the Erdős–Rényi model that I discussed last week: we pick a random graph by saying that any two of its vertices will be connected with probability p, independently of all the other pairs.

So, let’s fix a p and let n vary. Then, it turns out that there is an algorithm that can determine whether a graph has a Hamiltonian path in almost polynomial time. The algorithm is given in detail in Yuri Gurevich and Saharon Shelah’s paper “Expected Computation Time for Hamiltonian Path Problem,” but I’ll sketch the basic idea, which is very simple. (What they do is somewhat more subtle, but here’s my personalized version of it, which probably doesn’t actually work very well.) It runs in three steps:

  1. Choose a potential starting point and a potential ending point. (There are only a polynomial number of pairs to choose, so we can run through all such pairs in polynomial time.) Try to make a Hamiltonian path by making a path from the starting point that’s as long as possible, just by trying to add one vertex at a time. Once we get into trouble, do the same thing starting with the ending point. Try to make minor modifications to splice these two pieces together.
  2. If that doesn’t work, try to show there is no Hamiltonian path. To do this, identify troublesome vertices: the ones that have relatively small degree. See if there is a contradiction involving these problem vertices.
  3. If that doesn’t work, simply check every possible path and see if it gives a Hamiltonian path.

Steps 1 and 2 can be done in polynomial time; it is only step 3 that can’t. But it turns out that we won’t reach step 3 very often, because most of the time the algorithm will have finished after step 1 or 2.

Okay, so it’s an example, although I don’t find it particularly satisfying: as n gets large, almost all graphs chosen in this way have Hamiltonian paths, since they have so many edges! I’d be very likely to find one in Step 1, which just tries things at random. Thus, I could make an algorithm that says there is always a Hamiltonian path, and it will run very quickly and only very rarely be wrong.

So, I am curious to know whether there are more interesting examples of problems with average polynomial time algorithms but that are not expected to have worse-case polynomial time algorithms.

Posted in Uncategorized | Leave a comment

One interesting thing: Erdős–Rényi random graphs

I’ve been thinking a lot about random graphs this week, so it is appropriate for my post this week to be about them. One of the most important models of random graphs is the Erdős–Rényi model. Here is the construction: we fix some parameter p with 0<p<1, and we choose some positive integer n. Then, we draw n vertices. Between each pair of vertices, we put an edge there with probability p, independently of all the other pairs. I’ve been thinking about problems relating to evolutionary dynamics on such graphs, but I won’t describe that today. (Maybe I will at some point, when I understand them better.)

Instead, I just want to talk about the graphs themselves, since there are a lot of questions one can ask about the structure of them. What do they look like? In particular, are they likely to be connected?

The answer is that it depends on how large p is relative to n. There is a phase change at p=\log(n)/n. In fact, we have the following theorem (which is not quite the best possible):

Theorem (Erdős–Rényi). For any \varepsilon>0, we have the following: 

  1. If p>(1+\varepsilon)\frac{\log(n)}{n}, then the probability that the graph is connected goes to 1, as n goes to infinity.
  2. If p<(1-\varepsilon)\frac{\log(n)}{n}, then the probability that the graph is connected goes to 0, as n goes to infinity.

I’m not going to give a proof here, but a proof can be found on this MathOverflow page.

But what happens when the graph is disconnected? To what extent does it fail to be connected? There is a very nice answer to this question as well:

Theorem. Suppose that c=np is a constant. Then:

  1. If c<1, then with high probability all the connected components are small: they have size O(\log(n)).
  2. If c>1, then with high probability there is exactly one large connected component, of size \beta(c) n+o(n), for some number \beta(c)>0, and the rest of the connected components have size O(\log(n)).

The large component when c>1 is called, suitably, the Giant Component. A proof can be find here.

There is a lot more than one can say about the Erdős–Rényi model for random graphs, but this will be enough for now.

Posted in Uncategorized | 1 Comment