Inside PokerSnowie's brain reveals the work of the Snowie AI Team. It explores firsthand how the brain of PokerSnowie evolves and learns advanced strategic concepts on its own.
PokerSnowie's ultimate aim is to produce the perfectly balanced game and to find the ultimate unexploitable equilibrium for all No Limit Hold'em configurations. Join us on this fascinating journey, which is just starting, into the future of poker.
Is a Game Theory Optimal (GTO) approach exploitable? A simplified example
In any game, there are decisions to be made, and there are payoffs for those decisions. In poker, you get to choose which hands you'll play, how to play them, and how much money to commit with them. Your reward for those decisions is the pot, and your potential cost is how much you've bet in the hand. By contrast, in Rock Paper Scissors (RPS), you simply decide what your next throw is going to be, and your payoff for that decision is winning or losing the throw. How much you win or lose in either game depends on what strategy you use. This is where game theory comes in. Game Theory Optimal (GTO) play is the strategic baseline. It is at the core of all games.
“Optimal” in the context of game theory means “not exploitable”. It may not be the best play or have the highest payoff against a particular opponent, but it is the default from which all other strategy is derived. You may already have guessed that the optimal strategy for RPS is to choose each option at random with probabilities [1/3, 1/3, 1/3]. This ensures that no matter what your opponent throws, over the long run you will break even with him. It simply doesn't matter what he does. He can't out-play you or out-think you, he can't read a tell from you, and he can't deduce from your past throws what you're likely to do next; all he can do is break even. Similarly, in poker, there exists a strategy (a set of bet sizes, frequencies, hand ranges, etc.) which cannot be beaten. Whatever your opponent does, over the long run he can only tie with you at best. He can never beat you. And here's where it gets even better: in poker, unlike in RPS, when your opponent deviates from GTO, he typically loses money to you. In RPS, he still ties no matter what, but in poker, he loses.
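The break-even claim for RPS is easy to check numerically. The following is a minimal sketch, not anything taken from PokerSnowie itself: the function names and the two sample opponents (`rock_lover`, `always_paper`) are made up for illustration, and a win is assumed to pay +1 and a loss -1.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
# BEATS[a] is the move that a defeats.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """Assumed scoring: +1 for a win, -1 for a loss, 0 for a tie."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def expected_value(my_mix, their_mix):
    """Expected payoff per throw when both players randomize independently."""
    return sum(
        my_mix[a] * their_mix[b] * payoff(a, b)
        for a in MOVES
        for b in MOVES
    )


# The optimal strategy from the text: each throw with probability 1/3.
uniform = {m: Fraction(1, 3) for m in MOVES}

# Hypothetical opponents, invented for this example.
rock_lover = {
    "rock": Fraction(8, 10),
    "paper": Fraction(1, 10),
    "scissors": Fraction(1, 10),
}
always_paper = {"rock": Fraction(0), "paper": Fraction(1), "scissors": Fraction(0)}

print(expected_value(uniform, rock_lover))       # 0
print(expected_value(uniform, always_paper))     # 0
print(expected_value(always_paper, rock_lover))  # 7/10
```

The uniform mix earns exactly zero against both opponents, whatever they do. Always throwing paper wins 7/10 of a unit per throw against the rock lover, but that lopsided strategy is itself wide open to anyone who starts throwing scissors.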
To illustrate this, let's create a very simple card game. It's like War, except for a few modifications:
· Ties are ties; there is no “going to war”
· If you beat your opponent by exactly one rank, you win 5 units; any other win pays 1 unit
· You and your opponent get separate decks, and you each get to choose your card; it is NOT a random draw
In this modified War game, it is pretty obvious that just picking an ace every time is a really good idea. But what if your opponent hasn't grasped the strategic nuances of this game? What if he really likes picking the seven of clubs, so much so that he picks it 80% of the time? Clearly, your best bet is not to pick an ace, but instead to always pick an eight, winning 5 units from him very often. The strategy of always picking an eight will clearly win more money than picking an ace, but it is also exploitable. If your opponent ever wises up and starts picking higher cards more often (or, God forbid, a nine!), you could end up in a lot of trouble.
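To put rough numbers on this, here is a small sketch of the modified War game. Several things in it are assumptions rather than part of the rules above: aces are high, suits are ignored (so the seven of clubs is just a 7), the loser pays whatever the winner collects, and the opponent spreads his remaining 20% evenly over the other twelve ranks.

```python
from fractions import Fraction

# Ranks 2..10, then J=11, Q=12, K=13, A=14 (aces high: an assumption).
RANKS = range(2, 15)


def payoff(mine, theirs):
    """Payoff to me: a one-rank win pays 5, any other win pays 1, ties pay 0.

    Assumes the game is zero-sum, so a loss costs what the winner collects.
    """
    if mine == theirs:
        return 0
    size = 5 if abs(mine - theirs) == 1 else 1
    return size if mine > theirs else -size


def expected_value(card, their_mix):
    """Expected payoff of always picking `card` against a mixed opponent."""
    return sum(p * payoff(card, theirs) for theirs, p in their_mix.items())


# Hypothetical opponent: a 7 on 80% of picks, the rest spread evenly
# (an assumption; the text only specifies the 80%).
seven_lover = {
    r: Fraction(8, 10) if r == 7 else Fraction(2, 10) / 12 for r in RANKS
}

best = max(RANKS, key=lambda card: expected_value(card, seven_lover))

print(best)                             # 8
print(expected_value(8, seven_lover))   # 47/12
print(expected_value(14, seven_lover))  # 21/20
```

Under these assumptions the eight earns about 3.9 units per round against this opponent, while the ace earns only about 1.05.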
This is where Game Theory Optimal shines. Optimal play in such a game is to always pick an ace, because no card outranks an ace and so an ace can never lose. Against another smart player, you will always tie because he will pick an ace, and against poor players, you will win 1 unit most of the time (or occasionally 5 units whenever the poor soul chooses a king). Against poor players, like our chump who chooses the seven of clubs frequently, by playing optimally you will not win as much as the competent player who deviates from optimal in order to maximally exploit his opponent, but nor will you open yourself up to being re-exploited. While YouTube may be filled with videos of amateurs “hero-calling” and winning 5 units by picking a five versus their opponent's four, the high stakes games will be filled with professionals always picking aces against each other, waiting for rich amateurs to come and play a suboptimal strategy. Then, they can choose if and how much to deviate from optimal, trying to find the right balance between getting as much of the poor player's money as possible and protecting themselves from being exploited by the other professionals.
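One way to see why the ace is the unexploitable choice is to look at each card's worst case, that is, what it earns against the single most damaging reply. The sketch below makes the same assumptions as before (aces high, suits ignored, losses mirror wins), and `worst_case` is just an illustrative helper name.

```python
# Ranks 2..10, then J=11, Q=12, K=13, A=14 (aces high: an assumption).
RANKS = range(2, 15)


def payoff(mine, theirs):
    """Payoff to me: a one-rank win pays 5, any other win pays 1, ties pay 0.

    Assumes the game is zero-sum, so a loss costs what the winner collects.
    """
    if mine == theirs:
        return 0
    size = 5 if abs(mine - theirs) == 1 else 1
    return size if mine > theirs else -size


def worst_case(card):
    """Payoff of always picking `card` against the opponent's best reply."""
    return min(payoff(card, theirs) for theirs in RANKS)


# Cards that can never lose, whatever the opponent picks.
safe = [card for card in RANKS if worst_case(card) >= 0]

print(worst_case(14))  # 0
print(worst_case(8))   # -5
print(safe)            # [14]
```

Every card below the ace can be beaten by the card one rank above it for 5 units, so the ace is the only pick whose worst case is a tie.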
This is why GTO is important: It is the foundation of all strategy. When you deviate, you open yourself up to exploitation. While you may not choose to play GTO all the time, knowing what it is and how far you are deviating helps you understand where to find potential holes in your game, where you leave yourself unarmed.
Multi-player, deep-stacked No Limit Hold'em may be a far cry from Rock Paper Scissors or War, and the strategy may be much more complex, but the principles are the same: game theory still applies, and an optimal strategy does exist. PokerSnowie has developed a strategy very close to GTO through years of research and a neural network trained over trillions of hands. It doesn't know what the Book says to do, and it doesn't know what the latest strategy video thinks; all it knows is winning.