Machines for Math

I came across this article today while looking for information on how Jensen and Williamson implemented an algorithm for computing the p-canonical basis. Not recognizing any of the co-authors, I assumed it was a group of students, until I ran into this paragraph.

In this work we prove a new formula for Kazhdan-Lusztig polynomials for symmetric groups. Our formula was discovered whilst trying to understand certain machine learning models trained to predict Kazhdan-Lusztig polynomials from Bruhat graphs. The new formula suggests an approach to the combinatorial invariance conjecture for symmetric groups.
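
Quick refresher, mostly for myself (so double-check the normalization against the original paper): for x ≤ w in Bruhat order on the symmetric group, the P_{x,w}(q) are the coefficients that make the Kazhdan–Lusztig basis of the Hecke algebra bar-invariant, subject to a degree bound:

```latex
% Kazhdan-Lusztig's normalization (1979): C'_w is the unique element
% fixed by the bar involution (q^{1/2} -> q^{-1/2}, T_x -> T_{x^{-1}}^{-1})
% whose coefficients satisfy the degree bound below.
C'_w \;=\; q^{-\ell(w)/2} \sum_{x \le w} P_{x,w}(q)\, T_x,
\qquad P_{w,w} = 1,
\qquad \deg P_{x,w} \le \tfrac{\ell(w)-\ell(x)-1}{2} \ \text{ for } x < w.
```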

Turns out the co-authors are Google, I mean DeepMind, people. Throwing machine learning at something like computing KL polynomials seemed kind of wild to me, and I guess it is sort of radical. At least radical enough that they published another paper in Nature describing this and other efforts to use machine learning to formulate or solve conjectures. And U. Sydney is going on a PR push about it.

Reading the Nature article, it is clear that this isn’t really the doomsday scenario we worry about, where machines can prove all the theorems and we mortals are useless to them. It sounds like there’s still a high degree of specialized knowledge and intuition that goes into training the supervised learning model. At least that’s my sense from this paragraph:

We took the conjecture as our initial hypothesis, and found that a supervised learning model was able to predict the KL polynomial from the Bruhat interval with reasonably high accuracy. By experimenting on the way in which we input the Bruhat interval to the network, it became apparent that some choices of graphs and features were particularly conducive to accurate predictions. In particular, we found that a subgraph inspired by prior work may be sufficient to calculate the KL polynomial, and this was supported by a much more accurate estimated function.
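
To make the quoted paragraph concrete for myself, here is roughly the shape of that experimental loop in Python. To be clear, this is a toy sketch under my own assumptions, not DeepMind’s setup: they trained graph neural networks on actual Bruhat intervals, whereas I’m using random DAGs, a made-up featurize() map, and a synthetic target, just to show the moving parts.

```python
# Toy sketch of the experimental loop, under my own assumptions: random
# DAGs stand in for Bruhat intervals, featurize() is a made-up feature
# map, and the target is synthetic. (The real experiment fed actual
# Bruhat intervals to graph neural networks; this only shows the shape
# of "featurize a graph, fit a model, measure accuracy".)
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def random_dag(n_nodes, p=0.3):
    """Random DAG: edge i -> j for i < j with probability p."""
    return np.triu(rng.random((n_nodes, n_nodes)) < p, k=1)

def featurize(adj):
    """Hand-picked graph statistics (hypothetical stand-ins for the
    graph features one might feed the model)."""
    degs = adj.sum(0) + adj.sum(1)
    return [adj.shape[0], adj.sum(), degs.mean(), degs.max(), degs.std()]

# Synthetic dataset: in the real experiment each row of X would come
# from a Bruhat interval [x, w] and y from a coefficient of P_{x,w}(q).
graphs = [random_dag(rng.integers(5, 30)) for _ in range(2000)]
X = np.array([featurize(g) for g in graphs])
y = np.array([int(g.sum()) % 3 for g in graphs])  # fake "KL coefficient"

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_tr, y_tr)
print("test R^2:", model.score(X_te, y_te))
```

In this picture, “experimenting on the way in which we input the Bruhat interval” just means swapping out featurize(), or the graph handed to it, and watching whether accuracy jumps.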

I don’t know much about machine learning, and the second sentence of that quoted paragraph doesn’t give a clear sense of how the “experimenting” with the inputs actually went, but I would be surprised if the machine was able to pull out hypercube decompositions and diamond completeness on its own as the features it needed to really predict the KL polynomials. That is, it sounds like ML could be a useful interactive tool for discovering and verifying ideas, much as interactive theorem proving seeks to be, but they’re still not doing the math for us. Not to mention that the proof of the main formula of the paper involves several layers of categorification and some pretty heavy geometry.
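
Meanwhile, the thing the machine is being trained to predict is of course classically computable. For small symmetric groups you can grind out P_{x,w} exactly with the original Kazhdan–Lusztig recursion via R-polynomials. Here is a quick implementation of my own (so caveat lector, and not anything from either paper), with permutations as tuples in one-line notation and polynomials in q as {degree: coefficient} dicts:

```python
# The non-ML route, for contrast: the classical Kazhdan-Lusztig
# recursion via R-polynomials. Permutations are tuples in one-line
# notation on {1,...,n}; polynomials in q are dicts {degree: coeff}.
from functools import lru_cache
from itertools import permutations

def length(w):  # number of inversions = Coxeter length
    return sum(w[i] > w[j] for i in range(len(w)) for j in range(i + 1, len(w)))

def bruhat_leq(x, w):  # Ehresmann's tableau criterion for Bruhat order
    return all(a <= b
               for i in range(1, len(x))
               for a, b in zip(sorted(x[:i]), sorted(w[:i])))

def mul_s(w, i):  # right-multiply by the adjacent transposition s_i
    w = list(w); w[i], w[i + 1] = w[i + 1], w[i]; return tuple(w)

def add(p, q, c=1):  # p + c*q, dropping zero coefficients
    out = dict(p)
    for d, a in q.items():
        out[d] = out.get(d, 0) + c * a
    return {d: a for d, a in out.items() if a}

def shift(p, k):  # multiply by q^k
    return {d + k: a for d, a in p.items()}

@lru_cache(maxsize=None)
def R(x, w):  # R-polynomial R_{x,w}
    if not bruhat_leq(x, w): return {}
    if x == w: return {0: 1}
    i = next(i for i in range(len(w) - 1) if w[i] > w[i + 1])  # w*s_i < w
    if x[i] > x[i + 1]:  # x*s_i < x
        return R(mul_s(x, i), mul_s(w, i))
    qm1 = add(shift(R(x, mul_s(w, i)), 1), R(x, mul_s(w, i)), -1)  # (q-1)R_{x,ws}
    return add(qm1, shift(R(mul_s(x, i), mul_s(w, i)), 1))         # + q R_{xs,ws}

@lru_cache(maxsize=None)
def P(x, w):  # Kazhdan-Lusztig polynomial P_{x,w}
    if not bruhat_leq(x, w): return {}
    if x == w: return {0: 1}
    n = length(w) - length(x)
    f = {}  # f = sum over x < z <= w of R_{x,z} * P_{z,w}
    for z in permutations(range(1, len(w) + 1)):
        if z != x and bruhat_leq(x, z) and bruhat_leq(z, w):
            for d1, a in R(x, z).items():
                for d2, b in P(z, w).items():
                    f[d1 + d2] = f.get(d1 + d2, 0) + a * b
    # q^n P(1/q) - P(q) = f and deg P <= (n-1)/2 force p_d = -f_d low down
    return {d: -a for d, a in f.items() if 2 * d <= n - 1 and a}

# The first nontrivial example (Kazhdan-Lusztig's original):
print(P((1, 3, 2, 4), (3, 4, 1, 2)))  # {0: 1, 1: 1}, i.e. P = 1 + q
```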

Still, all in all, it looks like we’re all gonna have to learn how machines learn sooner or later.