ChatGPT Can Predict Chess Moves, But Don’t Ask It for Poker Advice

International chess master Yosha Iglesias says that ChatGPT is now better at predicting the next move in a chess match between two of the game’s top players than Stockfish, the world’s strongest chess engine. That news might be alarming to professional game players of all stripes, particularly poker players, for whom computer-assisted cheating is an increasing problem. However, the ability to predict moves isn’t quite the same thing as being good at playing. Before sounding the alarm, it’s important to consider the nature of ChatGPT and how it “knows” what it knows about these games.

In a post on X (formerly Twitter), Iglesias says she pitted the two algorithms against each other in a battle of predictive power, using the last ten games between Magnus Carlsen and Hikaru Nakamura as the test. She says ChatGPT won by a margin of 58% to 38% and reliably estimated its own likelihood of being right for each guess.

ChatGPT is a large language model (LLM): an algorithm trained using a massive volume of text to produce human-like responses to written input. Skeptics describe these programs as nothing more than a souped-up version of your phone’s predictive text functionality. Others argue that the structure of the LLMs mimics human neurology and that our own world model is similarly rooted in language. Some people claim that ChatGPT is genuinely intelligent or even sentient, though the official stance of its developers—and ChatGPT itself—is that it isn’t.

Being able to predict chess moves from a partial game record is the sort of party trick that the latter group would point to as a sign of intelligence. But ChatGPT is, at best, half of a game-playing AI. It knows standard patterns and the habits of famous players, but that’s about it.

Move Suggestion Is Just Half of Game AI

ChatGPT has no real-world experience. It knows about things only based on writing that already exists on the internet. In the case of chess and poker, the games’ popularity means ample game records and analyses exist in the training data.

To predict moves, it uses its vast experience of chess game records to find and suggest moves that come up often in similar contexts. That’s not a bad plan. In fact, some dedicated game-playing algorithms use this as a starting point. Google’s AlphaGo, for instance, used a database of professional games to train its pattern-suggestion module.

There’s another critical piece to these algorithms, however, that’s largely missing from ChatGPT: the board evaluation component, which decides whether a resulting position is good or bad.
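The two-part design described here, a pattern-based move suggester paired with an evaluation component, can be illustrated with a toy sketch. The code below is purely hypothetical (a simple number-race game, not chess, and invented function names) and only shows why evaluation matters: the “policy” alone picks the move that appears most often, while even a one-ply evaluation catches suggestions that lose on the spot.

```python
# Toy illustration (hypothetical): players alternately add 1, 2, or 3
# to a running total; whoever reaches exactly 10 wins.

TARGET = 10

def legal_moves(total):
    return [m for m in (1, 2, 3) if total + m <= TARGET]

# "Policy": stands in for a pattern-matcher that blindly favors the
# move seen most often in its training data (here, always adding 3).
def policy_rank(total):
    return sorted(legal_moves(total), key=lambda m: -m)

# "Evaluation": look one ply ahead. A move that wins immediately
# scores +1; one that hands the opponent an immediate win scores -1.
def evaluate(total, move):
    after = total + move
    if after == TARGET:
        return 1
    if any(after + reply == TARGET for reply in legal_moves(after)):
        return -1
    return 0

def best_move(total):
    # Re-rank the policy's suggestions by their evaluation score.
    return max(policy_rank(total), key=lambda m: evaluate(total, m))

# At total=4, the policy alone suggests 3 (reaching 7), after which
# the opponent wins by adding 3. Evaluation steers us elsewhere.
print(policy_rank(4)[0])  # → 3 (the policy's pick loses)
print(best_move(4))       # → 2 (avoids handing over the win)
```

A real engine evaluates many plies deep, of course; the point is only that a move suggester without any evaluation, which is roughly ChatGPT’s situation, cannot notice even a one-move-deep blunder.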

ChatGPT can tell you that one player is winning on material (pieces) based on a game record. It can make strategic generalizations, such as the importance of controlling the center of the chessboard. However, it can’t analyze a position much beyond that. For instance, it won’t recognize an inevitable capture, even just one move in advance.

In poker, the situation is everything. There are only ever two or three actions available to the player. ChatGPT typically suggests taking all of them at some frequency, even when one option is clearly wrong.

For instance, it told me I should fold a set to a continuation bet 10-20% of the time. Presented with a situation where the player holds Seven High on the river, it suggested calling to try to catch a bluff 30-40% of the time, even though that hand loses to most bluffs.
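The Seven High example can be sanity-checked with standard pot odds, a calculation not taken from the hands above: calling a bet B into a pot P risks B to win P + B, so the call must win at least B / (P + 2B) of the time to break even. The numbers below are illustrative only.

```python
# Standard pot-odds calculation (illustrative numbers, not from
# any specific hand discussed in the article).

def required_equity(pot, bet):
    # Minimum share of the time a call must win to break even:
    # risk bet to win pot + bet.
    return bet / (pot + 2 * bet)

# A pot-sized river bet requires winning one time in three.
print(round(required_equity(pot=100, bet=100), 3))  # → 0.333

# A half-pot bet still requires winning one time in four.
print(round(required_equity(pot=100, bet=50), 3))   # → 0.25
```

Seven High loses even to most bluffs, so its equity when called is nowhere near these thresholds, which is why a fixed 30-40% calling frequency is simply wrong rather than a defensible mixed strategy.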

ChatGPT Misunderstands Game Situations

You can tell ChatGPT doesn’t really understand board positions in chess if you attempt to play against it by asking it to “predict” the next move.

It makes consistency errors, sometimes attempting to replay a move it has already made. It may also suggest an illegal move until corrected, as in this example:

ChatGPT: White’s move c3 is a solid choice, supporting the central pawn on d4 and potentially preparing for pawn breaks like d4-d5 or b2-b4. In response, a professional player might consider continuing with development and preparing for future pawn breaks. One common move for Black in this position could be c5, challenging White’s central pawn structure and gaining space on the queenside.

AW: You can’t play C5 as your knight is in the way.

ChatGPT: You’re absolutely correct, my mistake. With the knight on f6, c5 is not a viable option. In that case, a typical move for Black in response to White’s c3 could be to continue with development, perhaps by playing Be7, preparing to castle kingside and maintaining flexibility in the center.

(Note: ChatGPT makes a second error when referring to the knight on f6. In the position in question, there was also a knight on c6, which is the one blocking the pawn.)

ChatGPT makes similar errors when it comes to poker. In one hand history that I asked it to analyze, it described A5 on a Kh-Jh-5s flop as “middle pair” when it is quite obvious the player has made bottom pair.

During the same hand, it correctly stated that the board was draw-heavy enough to call the turn with a marginal hand. However, it failed to adjust this opinion when faced with a river card that would connect with most drawing hands:

AW: Let’s go back to Doug Polk as the small blind. He bets 8 BB into a 12 BB pot. Daniel Negreanu raises to 24 BB.

Doug Polk calls. The river is the Th. Negreanu goes all-in. Do you think Doug Polk calls?

ChatGPT: Doug Polk’s decision on the river would depend on various factors, including his read on Daniel Negreanu’s range, the action on previous streets, and the relative strength of his own hand.

Given the action so far, with Negreanu raising on the turn after Polk’s bet, it suggests that Negreanu likely has a strong hand. Polk would need to assess whether Negreanu’s range includes hands that Polk beats, such as bluffs, missed draws, or weaker hands that Negreanu might overplay.

When pressed, ChatGPT gave Polk only a 20-30% chance of calling. However, its talk of ranges and previous action is purely abstract. It continues to discuss “missed draws” without mentioning that every possible drawing hand improved in some way with that river card, and that all of them now beat Polk’s hand.

ChatGPT Knows About Players and Their Styles

So, if ChatGPT has difficulty following the rules of chess, let alone playing at a Stockfish level, how can its predictions be so good?

For one thing, the two programs have different tasks. Stockfish is attempting to find the best move, even if it’s one that a human is likely to miss. Only ChatGPT has been given the task it’s being tested on, which is predicting the most likely move for a particular player.

Moreover, ChatGPT isn’t being graded on the quality of its errors. A good move that just isn’t the one the human played, a terrible blunder that would lose the game, or a move that isn’t even legal are all the same for the purposes of Iglesias’s experiment.

ChatGPT also has effectively “read” everything ever written on the internet about Carlsen and Nakamura. It knows what openings they prefer as well as more subtle habits. For instance, if one player is disproportionately likely to advance their bishop on move 6 in response to the opponent moving a knight, that’s the sort of thing ChatGPT picks up. It will then suggest such a move all the time and be right surprisingly often. However, the rest of the time its suggestion might be wildly inappropriate to the circumstances, unlike Stockfish’s guesses.

You can see the same tendency with poker if you ask it to substitute one player with another. Replacing Doug Polk with the famously conservative Allen Kessler, for instance, ChatGPT will suddenly tell you that it expects a fold more often.

Allen Kessler, known as the “Chainsaw” in the poker world, is typically known for his cautious and conservative playing style, often referred to as “The Chainsaw Manifesto.” He tends to prioritize preservation of chips and survival over aggressive plays, especially in marginal situations.

Given Kessler’s tendency to play cautiously and his preference for minimizing risk, his decision in this scenario might differ from Doug Polk’s approach[…] Allen Kessler might opt to check behind and take a more passive line with his Ah5c on the turn in this scenario, reflecting his cautious playing style and preference for risk mitigation.

Across all hands, it’s definitely a good guess that Kessler will fold more than Polk. However, ChatGPT’s guesses will have Kessler folding and Polk calling even in clear-cut situations where both players would do the same thing 100% of the time.

ChatGPT Reflects How We Write About a Topic

Overall, ChatGPT’s poker analysis is much vaguer than what it has to say about chess. In large part, that’s because of how human poker writers discuss the game.

ChatGPT holds a mirror to the way humans write about a topic. Chess is a perfect information game, so the right move is the right move, no matter who the opponent is. In poker, almost every piece of advice comes with a caveat. Writers are fond of saying that another move might be correct if the opponent is playing predictably and you know how to exploit them. Meanwhile, tournament reporters sometimes focus on player dynamics to the point of ignoring strategic fundamentals.

Consequently, ChatGPT makes the same error as many novice players. Because writers mention these exceptions so frequently, ChatGPT believes them to occur commonly. Amateur players likewise think high-level poker strategy is much more based on “reads” than it actually is.

The reality is that poker strategy is relatively straightforward in most situations. For instance, if you have top pair on a board where worse pairs can reasonably call, you should usually bet. Exceptions are, by definition, for exceptional circumstances.

Can You Use ChatGPT to Cheat at Poker?

Seeking any sort of third-party advice during a poker game is cheating. That’s true regardless of where the advice comes from and whether it’s any good or not. So, it is technically possible to cheat at poker using ChatGPT, just not very effectively.

It should be obvious by now that other players currently have nothing to fear from such cheating. At the moment, ChatGPT’s poker advice will be vague at best and outright wrong at worst. So, even if they evade detection, the cheater is unlikely to benefit.

That said, the long-term danger of LLMs to poker isn’t the strength of play but their accessibility.

Super-human poker AI already exists, but the tools are specialized products you need to pay for. The above-board ones are marketed as training aids and have features meant to discourage their use for cheating. So, anyone using them for that purpose knows what they’re doing.

Over the next few years, we will likely see convergence between LLMs and voice-activated AI assistants like Siri and Alexa. If ChatGPT-like algorithms advance to the point that they can play games like poker and chess reasonably well, the danger of casual cheating will increase. We might see recreational players getting tempted to ask their phones for advice while convincing themselves that they aren’t doing anything wrong.

That’s a very different sort of problem than the deliberate, professional cheating that happens currently. It would also be challenging for online poker sites to detect if many players do it occasionally but not consistently.

Fortunately, the results of my experimentation make it clear that this isn’t a problem yet. However, it’s a trend to watch for down the road.

About the Author

Alex Weldon


Alex Weldon is an online gambling industry analyst with nearly ten years of experience. He currently serves as Casino News Managing Editor for Bonus.com, part of the Catena Media Network. Other gambling news sites he has contributed to include PlayUSA and Online Poker Report, and his writing has been cited in The Atlantic.