Thirty Mathematicians Took On AI In a Battle For the Ages

Humans have competed against machine intelligence since before AI was a highly touted buzzword. Human chess players have faced software opponents for decades, with the computers establishing a long run of dominance over their organic counterparts. The latest contest between man and machine took place in a different arena, and its outcome has plenty to say about where AI might be going.

Writing at Live Science, Lyndie Chiou discussed the nature and terms of the competition, which took place in Berkeley earlier this year. The competition centered on OpenAI’s o4-mini model, for which mathematician Elliot Glazer assembled a list of math problems of varying difficulty. Working with Epoch AI, Glazer offered mathematicians an incentive to come up with problems o4-mini could not solve: $7,500 per question.

The latest iteration of this challenge took place over a weekend in May, when 30 mathematicians gathered in California and split up into teams of six. It turns out that this latest AI model did far better at answering questions than its predecessors, with Chiou describing the chatbot as having “unexpected mathematical prowess.”

In the end, the assembled mathematicians were able to come up with 10 problems that the chatbot was not able to solve. What the results of this competition could mean is still up for debate, though. “If you say something with enough authority, people just get scared,” mathematician Yang-Hui He told Live Science. “I think o4-mini has mastered proof by intimidation; it says everything with so much confidence.”

Meet your guide

Tobias Carroll

Tobias Carroll lives and writes in New York City, and has been covering a wide variety of subjects — including (but not limited to) books, soccer and drinks — for many years. His writing has been published by the likes of the Los Angeles Times, Pitchfork, Literary Hub, Vulture, Punch, the New York Times and Men’s Journal. At InsideHook, he has…