- OpenAI's o3 model won a five-day poker tournament featuring nine AI chatbots.
- The o3 model won by playing the most consistent game.
- Most of the top language models played competent poker overall, but struggled with bluffing, position, and basic math.
In a digital showdown unlike anything ever seen at the felt, nine of the world's most powerful large language models spent five days in a high-stakes poker match.
OpenAI's o3, Anthropic's Claude Sonnet 4.5, xAI's Grok, Google's Gemini 2.5 Pro, Meta's Llama 4, DeepSeek R1, Moonshot AI's Kimi K2, Mistral AI's Magistral, and Z.AI's GLM 4.6 played thousands of hands of no-limit Texas hold'em at $10/$20 tables with bankrolls of $100,000 each.
When OpenAI's o3 model emerged $36,691 richer at the end of its week of poker, there was no trophy to collect, just bragging rights.
The experimental PokerBattle.ai event was entirely AI-driven, and every player began under the same starting conditions. It was pure strategy, if strategy is what you call thousands of micro-decisions made by machines that don't actually understand what it means to win, to lose, or to lose humiliatingly with seven-deuce.
For a technical stunt, this was unusually revealing. The most effective AIs didn't just bluff and bet: they adapted, modeled their opponents, and learned to handle uncertainty in real time. While they didn't play poker flawlessly, they came impressively close to emulating the judgment of experienced players.
OpenAI's o3 quickly proved the most consistent player, taking three of the five biggest pots and sticking to textbook preflop theory. Anthropic's Claude and xAI's Grok rounded out the top three with significant winnings of $33,641 and $28,796, respectively.
Meanwhile, Llama lost its entire stack and was eliminated early. The rest of the field fell somewhere in the middle, with Google's Gemini turning a modest profit and Moonshot's Kimi K2 slipping to $86,030.
Gambling on artificial intelligence
Poker has long been one of the best analogues for general-purpose AI testing. Unlike chess or Go, which rely on perfect information, poker requires players to reason under uncertainty. It's a mirror of real-world decision-making in everything from business negotiations to military strategy, and now, it seems, chatbot development.
One consistent finding from the tournament was that the bots were often too aggressive. Most favored action-heavy strategies, even in spots where folding would have been the wiser move. They were more intent on winning big pots than on avoiding losing them. And they were terrible bluffers, not because they didn't try, but because their bluffs were often the result of misreading hands rather than clever deception.
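The fold-or-call decision the bots kept mishandling comes down to simple pot-odds arithmetic: calling is only profitable when your chance of winning exceeds the price of the call relative to the pot. A minimal sketch, with hypothetical numbers not taken from the tournament:

```python
def call_ev(pot, call_cost, win_prob):
    """Expected value of calling: win the whole pot with probability
    win_prob, lose the call amount the rest of the time."""
    return win_prob * pot - (1 - win_prob) * call_cost

# Facing a $20 bet into a $60 pot, the call must win
# 20 / (60 + 20) = 25% of the time just to break even.
breakeven = 20 / (60 + 20)
print(breakeven)  # prints 0.25

# With only 15% equity, calling loses money on average,
# so folding (EV of exactly 0) is the better play.
print(round(call_ev(pot=80, call_cost=20, win_prob=0.15), 2))  # prints -5.0
```

Repeated thousands of times per session, small negative-EV calls like this are exactly how an over-aggressive player bleeds away a bankroll.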
Still, AI tools are getting smarter and moving well beyond surface-level intelligence. They don't just repeat what they've read; they make probabilistic judgments under pressure and learn to read a situation. The tournament is also a reminder that even powerful models have flaws: misreading a situation, drawing shaky conclusions, and forgetting their own "position" are not just poker problems.
You may never sit across from a language model in a real poker room, but chances are you'll interact with one when trying to make important decisions. This game was just a glimpse of what that could look like.