Game-Theoretic Poker Agents

http://poker.cs.ualberta.ca/publications/IJCAI03.pdf

Commentary on paper: This paper explores how game theory can be used to model full-scale poker. The authors begin by pointing to past researchers (in 1950 and 1992) who attempted to solve the game of poker, but with limited success: they had to simplify the game so drastically that their solutions were useful only in theory, not in practice. The authors of this 2003 paper describe methods of mathematical abstraction that enabled them to reduce the size of the game from the order of 10^18 states down to 10^7, without losing the most important properties of 2-player Texas Hold’em poker. In short, they experimented with various rule modifications and abstractions that yielded a game small enough to solve exactly. The exact solutions to these abstracted games then served as useful approximations of how an agent should play real poker in a practical and game-theoretically optimal way.
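To make the abstraction concrete, here is a minimal sketch of the bucketing idea the paper relies on: hands of similar strength are grouped into a small number of equivalence classes, so the solver only has to reason about buckets rather than individual card combinations. The `wins_fn` rollout callback and the bucket count below are hypothetical stand-ins for illustration, not details from the paper.

```python
import random

def bucket_of(strength: float, n_buckets: int = 6) -> int:
    """Map a hand-strength estimate in [0, 1] to a discrete bucket.

    Bucketing treats all hands in the same bucket as strategically
    identical, which is what collapses the game tree by orders of
    magnitude. The bucket count here is an illustrative choice.
    """
    return min(int(strength * n_buckets), n_buckets - 1)

def monte_carlo_strength(wins_fn, trials: int = 10_000) -> float:
    """Estimate hand strength as the fraction of random rollouts won.

    `wins_fn` is a hypothetical callback: a real implementation would
    deal random opponent and board cards, score both hands with a
    7-card evaluator, and return True if our hand wins.
    """
    return sum(wins_fn() for _ in range(trials)) / trials

# Stand-in rollout: pretend our hand wins ~60% of random showdowns.
strength = monte_carlo_strength(lambda: random.random() < 0.60)
print(bucket_of(strength))  # maps to bucket 3 of 6
```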

How this paper relates to class: I found this paper so interesting because it combines two disciplines that I am very passionate about. For one, it briefs us on the origins of game theory and on how poker is a game of imperfect information and probabilistic outcomes. While our class will likely not dive into this particular type of game, the imperfect-information game trees used to model it still follow the same principles of strategies and equilibria that we are discussing now. These game trees often have so many nodes, however, that there are not enough computing resources to solve them, and this is where the field of AI comes in. The authors note earlier attempts to apply techniques common in game-playing AI (alpha-beta search, the Simplex method), but even these fell short for this type of game! This only intrigued me further.

Ultimately, the researchers settled on a cleverer variation of the Simplex method, one that assumes every player has “perfect recall” of everything that has happened so far in the game. With this final mathematical abstraction in place, they solved their reduced games and came up with six pseudo-optimal agents: three that were experts in pre-flop betting, one trained to play an entire game (Poki), one trained to counteract Poki’s strategies, and one that slowly adapts to persistent patterns in play. Pitted against human players, these bots had mixed success, but in the end they were able to prevail over some of the top poker players in the world. One of the most interesting comments during these trials was: “The bot has me figured out now.” In reality, none of the bots had robust opponent modeling, but this emotional factor was enough to confuse a professional and derail them from their strategies (known as “tilting” a player), leaving them vulnerable to the disciplined bots.
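Since the paper’s key computational step is solving a game as a linear program, here is a minimal sketch of that idea on a toy zero-sum matrix game. The paper solves a far larger sequence-form LP over the abstracted game, but the principle is the same; the payoff matrix below is illustrative, not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(payoff: np.ndarray):
    """Optimal mixed strategy for the row player of a zero-sum game.

    Maximize v subject to:  payoff^T x >= v  (componentwise),
                            sum(x) = 1,  x >= 0.
    linprog minimizes, so we minimize -v.
    """
    m, n = payoff.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                         # minimize -v  ==  maximize v
    # For each opponent column j:  v - sum_i x_i * payoff[i, j] <= 0
    A_ub = np.hstack([-payoff.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])  # sum(x) = 1
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]  # x >= 0, v unbounded
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]          # mixed strategy, game value

# Toy example with made-up payoffs for two row actions ("bet"/"check").
strategy, value = solve_zero_sum(np.array([[1.0, -2.0], [-1.0, 3.0]]))
print(strategy, value)
```

For this toy matrix the solver returns a mixed strategy of roughly [4/7, 3/7] with game value 1/7, which illustrates the paper’s point: the optimal player randomizes to guarantee a value regardless of the opponent, rather than trying to exploit anyone in particular.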

Important takeaways: The authors underscore that a game-theoretically optimal player is, at its core, NOT designed to win; it is only designed to not lose. The mixed strategies it employs will therefore only beat human players to the extent that those players are imperfect and make dominated errors. This ties back into the concept of opponent modeling and, more fundamentally, game modeling in the first place. The pseudo-optimal strategies derived in this paper perform no opponent modeling, so there is no guarantee that they will be maximally effective against very bad or highly predictable players. As a result, modern poker-playing AI agents can only expect modest margins of victory in practice. In the end, proper game models, opponent models, and in-game adaptation will be the key difference between agents that can merely compete with humans and those that can surpass us in skill.
