• OpenAI’s o3 defeated Elon Musk’s Grok 4 at chess
• Magnus Carlsen delivered biting commentary on the quality of Grok’s logic
• Grok 4 made repeated blunders, while o3 played steadily

The AI chess tournament final between OpenAI’s o3 model and xAI’s Grok 4 invited plenty of speculation as a kind of proxy battle between the two companies and their respective CEOs. Any comparison to the days of Deep Blue and Bobby Fischer soon faded, though, as OpenAI’s o3 swept Grok 4 in four straight games, to derisive commentary from former world chess champion Magnus Carlsen and grandmaster David Howell.

The showdown happened on Kaggle’s Game Arena, a digital coliseum where AI models battle in chess and other games. The tournament featured eight of the most prominent LLMs in the business: OpenAI’s o3 and o4-mini, Google’s Gemini 2.5 Pro and Flash, Anthropic’s Claude Opus, DeepSeek, Moonshot’s Kimi, and xAI’s Grok 4. The final came down to Grok and o3, but the final round didn’t look like a battle of champions.

Carlsen and Howell veered between serious commentary and a roast as Grok played erratically. In the first game, it quickly sacrificed its bishop, then began trading pieces like it was in a hurry to go home. Things didn’t improve for Grok in the next game.

“[Grok] is like that one guy in a club tournament who has learnt theory and literally knows nothing else,” Carlsen said during the second game. “Makes the worst blunders after that.”

Grok’s performance was so off the rails that Carlsen rated it around 800 Elo, or slightly above beginner level. He gave o3 a modest but respectable 1200, about average for a hobby player. Though o3 didn’t play brilliantly, it didn’t have to. It played solid chess. It didn’t blunder pieces. It converted its advantages and stuck to sound, conventional moves.

“o3 is fairly ruthless in conversions; it looks like a chess player. Grok looks like it learnt a few opening moves and knows the rules, but not much more,” Carlsen said. “Grok’s moves are chess-related moves. They just came at the wrong time and in weird sequences.”

Chess AI

The chess wasn’t the main point of the tournament, despite its prominence. It was about how general-purpose AI models handle tasks with strict rules, such as chess. Turns out, they’re not great, but o3 is the best of the limited sample. As AI becomes embedded in everything, the ability to follow rules and spot patterns becomes essential. Chess is a uniquely transparent way to observe that. You either made the right move or you didn’t. When a model plays well, you can see the logic; otherwise, queens fall like dominoes, and the game becomes as confused as that metaphor.

Chess is a window into how well an AI can plan, evaluate options, avoid catastrophic mistakes, and stay logically consistent. If Grok throws away a queen because it doesn’t grasp long-term consequences, what might it do in a legal document, or when booking travel?

That the final was between OpenAI and xAI did add some drama, with Sam Altman and Elon Musk publicly at loggerheads. The chess final didn’t resolve the battle between them, but it did give OpenAI a PR win, and a limited but very real compliment from Magnus Carlsen.

By admin