I want to understand another toy game that is very similar to the first toy game I studied. Except, the “bluffs” don’t have 0.6% equity; instead, they have 16% equity.
Here are the parameters:
- OOP: AA, A2o, A3o
- IP: KK
- Pot: 50
- Stacks: 950
- Flop: 888 rainbow
Also, there are no bets allowed on the turn and river. Betting and raising is only allowed on the flop. So in some ways it’s a lot simpler than the first toy game.
Here are the preflop ranges:
OOP preflop
IP preflop
Subtree 1 (OOP Bet) Overview
Here’s the OOP strategy on the flop:
OOP strategy
Here’s the IP strategy facing a bet:
IP facing bet
Most of this seems pretty similar to last time - OOP bets with a range that makes KK indifferent to calling or folding. But I want to dig into the details because there are some interesting things I want to point out here.
Subtree 1: Simpler way of thinking about a polarized betting range that makes KK indifferent to a call or fold
How do we figure out the right ratio of nuts and air to make KK indifferent to calling or folding? Before, I used to calculate this by starting from a range that consists of pure nuts and pure bluffs (nuts have 100% equity, bluffs have 0% equity). And from there, calculating the right ratio to make KK indifferent to calling or folding, based on the bet size relative to the pot size. But then, we need to make another adjustment, to factor in the equities, because nuts and bluffs typically don’t have 100% and 0% equity.
Instead of doing all that, what if we just consider the equity of the entire range as a whole?
Against a pot-sized bet, KK need to be good ⅓ of the time to make a break-even call (and be indifferent to calling or folding). So instead of getting all fancy and breaking down bluff ratios and then factoring in equity, we can just say that KK should have 33% equity against a balanced polarized betting range that is betting 1x pot. And indeed, we see that this is exactly true:
Equity KK facing flop bet
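As a quick sanity check on the pot-odds arithmetic, here is a minimal Python sketch. The function name `required_equity` is made up for illustration; it just divides the amount KK has to call by the final pot after the call.

```python
def required_equity(pot: float, bet: float) -> float:
    """Break-even equity for calling `bet` into `pot` (pot excludes the bet)."""
    # The caller risks `bet` to win a final pot of pot + bet + call.
    return bet / (pot + 2 * bet)


# Pot-sized bet into the 50 pot from this toy game.
print(round(required_equity(pot=50, bet=50), 3))  # 0.333
```

So whatever mix of AA and Ax OOP bets, the range as a whole should leave KK with about a third of the equity if the bet is pot-sized.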
Subtree 1: How to figure out the GTO call/fold mixing frequency for KK facing a flop bet
In our first toy game, facing a polarized betting range of AA and offsuit trash, KK mixed at approximately 50%. We figured out the reason for this was because:
- If KK folded more than 50%, OOP could bluff all offsuit trash profitably
- If KK folded less than 50%, OOP would never bluff offsuit trash on that street
So KK mixed at the right frequency to make bluffs indifferent to bluffing or check-folding.
In this toy game, the right mixing frequency seems to be different:
KK mixing frequency against bet
Incorporating the fact that A2o has ~16.4% equity against KK, we can create a formula for the EV when A2o bluffs KK, where:
- X is the frequency with which KK folds
- 1-X is the frequency with which KK calls
The formula is:
- A2o bluff EV = X(50) + (1-X)(0.16)(100) + (1-X)(0.84)(-50)
If we solve this for an EV of 0, KK should fold 34% of the time. However, in the sim, A2o is actually indifferent to bluffing when its EV is the same as a check, not a fold. And the EV on a check is 3.8:
OOP EV Ax
So if we solve the A2o bluff EV expression for 3.8, we get a folding frequency of 39%. Which is indeed the frequency with which KK folds:
KK mixing frequency against bet
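The bluff EV expression is linear in X, so it can be solved directly. Below is a rough Python sketch of that calculation, using the same numbers as the formula above (pot 50, bet 50, 16% equity). The function names are hypothetical, and the 3.8 check EV is simply the value read off the sim.

```python
def bluff_ev(fold_freq, pot=50.0, bet=50.0, equity=0.16):
    """EV of bluffing A2o when KK folds with frequency `fold_freq`."""
    win_when_called = equity * (pot + bet)   # win the pot plus KK's call
    lose_when_called = (1 - equity) * -bet   # lose our own bet
    called_ev = win_when_called + lose_when_called
    return fold_freq * pot + (1 - fold_freq) * called_ev


def indifferent_fold_freq(target_ev, pot=50.0, bet=50.0, equity=0.16):
    """Fold frequency X at which bluff_ev(X) equals `target_ev`."""
    ev_always_called = bluff_ev(0.0, pot, bet, equity)
    ev_always_folded = bluff_ev(1.0, pot, bet, equity)
    # EV is linear in X, so interpolate between the two endpoints.
    return (target_ev - ev_always_called) / (ev_always_folded - ev_always_called)


print(round(indifferent_fold_freq(0.0), 2))  # 0.34 (indifferent vs. an EV of 0)
print(round(indifferent_fold_freq(3.8), 2))  # 0.39 (indifferent vs. checking)
```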
Theoretically, if KK fold more than 39% of the time, A2o should have a higher EV on a bet than on a check, and should always bet. So I tried node locking KK folding against a bet to 40%, and in response, A2o is indeed always betting:
OOP adjust to KK overfold node lock
KK fold to bet node lock
If we lean the other way, and make KK overcall, then A2o has a lower EV on a bet than on a check, and should always check and never bluff. And that is indeed the case (KK underfolding at 35% frequency):
OOP adjust to KK overcall node lock
KK overcall node lock
So when KK overcall, we are never bluffing with A2o/A3o. It seems this also results in checking back AA a little more so we can continue to realize the equity of our entire checking range (I discuss what this means in the next section). And the rest of the time, AA bet to get value/deny equity from KK.
The high-level takeaway here: even when KK is indifferent to calling or folding facing a polarized balanced range, KK needs to mix at the right frequency, and that frequency is based on making the bluffs indifferent to bluffing when compared to their other actions. If the bluffs aren’t indifferent to bluffing, they’ll either always bluff or never bluff, depending on whether KK is folding too often or calling too often.
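To see why the two node locks flip A2o from always betting to never betting, here is a small Python sketch that compares the bluff EV to the check EV at both locked fold frequencies. It assumes the check EV stays at 3.8 (the unlocked sim value), which is a simplification, and the helper names are made up for illustration.

```python
CHECK_EV = 3.8  # EV of checking A2o, read off the unlocked sim (assumed fixed here)


def bluff_ev(fold_freq, pot=50.0, bet=50.0, equity=0.16):
    """EV of bluffing A2o when KK folds with frequency `fold_freq`."""
    called_ev = equity * (pot + bet) - (1 - equity) * bet
    return fold_freq * pot + (1 - fold_freq) * called_ev


def best_action(fold_freq):
    """Pure best response for A2o against a fixed KK fold frequency."""
    return "bet" if bluff_ev(fold_freq) > CHECK_EV else "check"


for fold_freq in (0.35, 0.40):
    print(fold_freq, round(bluff_ev(fold_freq), 2), best_action(fold_freq))
```

At a 35% fold frequency the bluff EV comes out well below 3.8, and at 40% it comes out above, so the best response jumps from pure check to pure bet.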
TODO for the future: What happens if we have multiple bluffs with different amounts of equity? Does KK adjust to the weakest bluff, or the average of all the bluffs?
Now, we can move on to the other half of the game tree, which is an OOP check. What variables affect the OOP checking strategy?
Subtree 2 (OOP Check) Overview
Just to review, here’s the OOP strategy on the flop:
OOP strategy
And here’s the IP strategy facing an OOP check (mixing between checking back and betting):
IP facing flop check
And here’s the OOP strategy facing an IP bet:
OOP facing IP bet
Subtree 2 key concept 1: Facing an OOP check, KK are indifferent to checking back or betting
Since there are no more streets of betting after the flop, if KK check the flop back, it goes straight to showdown. This means both players just get to realize the equity of their entire ranges. In that case, the EV of the two players should just be equity * pot.
And indeed, we see this is true:
KK check back equity
KK check back EV
Ok, so we know what’s going on when KK check back. But what about if KK bet? We know that KK are indifferent to betting or checking here, since it’s mixing its strategy between the two plays. Therefore, the EVs of the two decisions must be the same. Let’s calculate the EV of a KK bet.
When KK bet:
OOP facing IP bet
IP facing checkraise
- it is winning the pot 82% of the time
- it is getting raised 18% of the time with a range that makes it indifferent to calling or folding, so its EV from that point is 0, which effectively means it wins none of the pot and loses its bet
So KK EV on a bet is 0.82(pot) - 0.18(bet) = 36.5. And as we can see, that is the same as KK EV on a check!
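Here is that calculation as a minimal Python sketch. The bet size of 25 is an assumption: it isn't stated directly, but it is the size implied by the 36.5 result and by the 75 (pot plus bet) that shows up in the check-raise formulas later on. The function name is hypothetical.

```python
POT = 50.0
BET = 25.0        # assumed: inferred from the numbers in the post, not stated directly
OOP_FOLD = 0.82   # how often OOP folds to the KK bet
OOP_RAISE = 0.18  # how often OOP check-raises


def kk_bet_ev(pot=POT, bet=BET, fold=OOP_FOLD, raise_freq=OOP_RAISE):
    """EV of a KK bet: win the pot when OOP folds, lose the bet when raised.

    Facing the raise, KK is indifferent between calling and folding, so the
    raised branch is treated as simply losing the bet.
    """
    return fold * pot - raise_freq * bet


print(round(kk_bet_ev(), 1))  # 36.5
```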
So what’s going on here? OOP has mixed in enough AA into its checking range such that OOP’s EV is the same whether:
- KK check back, allowing OOP to realize all its equity
- KK bet, forcing OOP to re-polarize
So OOP has mixed in enough AA into its checking range such that OOP realizes the entire equity of its checking range, DESPITE OOP not always reaching showdown with its entire checking range. It’s really a very beautiful concept to see.
One detail is that if KK bets larger, OOP doesn’t need to check as many AA to realize the equity of its checking range. This is because it’s getting a larger reward (the KK bet) when it does polarize, so it doesn’t need to make a polarized raise against a KK bet as often. I tried a sim where I only allowed a pot-sized bet from KK and indeed, AA aren’t checking the flop as much:
KK larger bet nodelock OOP
KK larger bet nodelock
KK larger bet nodelock OOP facing bet
Subtree 2 key concept 2: If KK are indifferent to checking back or betting after OOP checks, how does GTO decide the optimal frequency to mix at?
I think in order to answer this question, we need to think from the perspective of AA. We know the EV of AA when betting - it’s actually higher than the pot:
AA EV on a bet
The reason for this is that when OOP polarizes and makes KK indifferent to calling or folding, KK EV is 0, BUT KK has to defend some of the time to keep our bluffs indifferent to bluffing. So when KK defends at this breakeven frequency, AA makes more money than the pot (on average, because KK does have some equity against AA) whenever KK calls.
When AA slowplay, KK have to bet at the right frequency to make AA indifferent to betting the flop or slowplaying. If KK bets too often, AA will always slowplay, because the EV of slowplaying will be higher than betting the flop. If KK bets too infrequently, AA will never slowplay, because the EV of betting the flop (fastplaying?) will be higher. Of course, we must verify.
Normally KK bet 54% of the time when checked to. If we nodelock to 55% (betting too much), we can see AA always slowplay:
KK overbet node lock OOP
KK overbet node lock
KK overbet node lock OOP facing bet
If we nodelock to 53% (betting too little) we can see AA never slowplay:
KK overcheck node lock OOP
KK overcheck node lock
KK overcheck node lock OOP facing bet
Finally I just want to run the math of how we calculate the proper betting frequency of KK when checked to, in order to make AA indifferent to slowplaying (protecting checking range) or fastplaying (betting out on the flop). In order to do that, we need to come up with a formula for AA’s EV on a check, and then solve it for AA’s EV when betting, in order to make AA indifferent to fastplaying or slowplaying.
If:
- X is the frequency with which KK check back after an OOP check
- Y is the frequency with which KK folds to an OOP check-raise
AA EV when checking is:
X(0.917)(50) + (1-X)(Y)(75) + (1-X)(1-Y)(125)(0.917) + (1-X)(1-Y)(-75)(0.083)
In order to solve for X, we need to know Y. Y is the folding frequency of KK facing a check-raise which makes bluffs indifferent to bluffing or folding (EV = 0 in both cases). In this case, it happens to be 36%:
KK facing a check raise
If we solve for this folding frequency:
- Y(75) + (1-Y)(0.164)(125) + (1-Y)(0.836)(-75) = 0
- Y = 0.36
So it’s right.
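The same kind of linear solve works for the check-raise bluff. A rough Python sketch, using the payoffs from the formula above (win 75 when KK folds, win 125 or lose 75 when KK calls, 16.4% equity); the function names are made up for illustration.

```python
def check_raise_bluff_ev(kk_fold, win_if_fold=75.0, win_if_called=125.0,
                         lose_if_called=75.0, equity=0.164):
    """EV of check-raising A2o/A3o as a bluff when KK folds with frequency `kk_fold`."""
    called_ev = equity * win_if_called - (1 - equity) * lose_if_called
    return kk_fold * win_if_fold + (1 - kk_fold) * called_ev


def indifferent_kk_fold(target_ev=0.0):
    """KK fold frequency Y that makes the bluff worth `target_ev` (0 = same as folding)."""
    ev_never_folds = check_raise_bluff_ev(0.0)
    ev_always_folds = check_raise_bluff_ev(1.0)
    # EV is linear in Y, so interpolate between the two endpoints.
    return (target_ev - ev_never_folds) / (ev_always_folds - ev_never_folds)


print(round(indifferent_kk_fold(), 2))  # 0.36
```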
So now that we know how often KK are calling a check-raise, AA EV when checking is:
- X(0.917)(50) + (1-X)(0.36)(75) + (1-X)(0.64)(125)(0.917) + (1-X)(0.64)(-75)(0.083)
- If we solve it for 73 (AA EV on betting flop), we get X = 0.46
Which is indeed how often KK are checking back when OOP checks:
IP facing flop check
So it all checks out.
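For anyone who wants to replay the numbers, here is a minimal Python sketch of AA’s EV when checking and the solve for X. It plugs in the values from the post (AA equity 0.917, Y = 0.36, AA bet EV 73); the function names are hypothetical, and the two node-lock frequencies at the end assume everything else stays fixed.

```python
AA_EQUITY = 0.917
KK_FOLD_TO_RAISE = 0.36  # Y, from the check-raise calculation
AA_BET_EV = 73.0         # AA EV when betting the flop, read off the sim


def aa_check_ev(kk_check_back, y=KK_FOLD_TO_RAISE, eq=AA_EQUITY):
    """EV of checking AA when KK checks back with frequency `kk_check_back` (X)."""
    showdown = eq * 50                   # KK checks back: AA realizes its equity in the pot
    kk_folds = 75                        # KK bets, AA raises, KK folds: pot plus KK's bet
    kk_calls = eq * 125 - (1 - eq) * 75  # KK bets, AA raises, KK calls
    after_kk_bet = y * kk_folds + (1 - y) * kk_calls
    return kk_check_back * showdown + (1 - kk_check_back) * after_kk_bet


def indifferent_check_back(target_ev=AA_BET_EV):
    """Check-back frequency X that makes AA indifferent between checking and betting."""
    ev_never_checks = aa_check_ev(0.0)
    ev_always_checks = aa_check_ev(1.0)
    return (target_ev - ev_never_checks) / (ev_always_checks - ev_never_checks)


print(round(indifferent_check_back(), 2))  # 0.46

# Node locks: KK betting 55% (X = 0.45) and KK betting 53% (X = 0.47).
for x in (0.45, 0.47):
    print(x, round(aa_check_ev(x), 2))
```

With X = 0.45 the check EV lands above 73 (so AA prefers to slowplay), and with X = 0.47 it lands below 73 (so AA prefers to bet), which lines up with the two node-lock results.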
Closing Thoughts
Honestly this solve has convinced me regarding the merits of exploitative play. What I mean is, there’s actually no such thing as “exploitative play”. What people commonly refer to as exploitative play is just GTO play against a non-GTO opponent. And what people commonly refer to as GTO play is simply GTO play against a GTO opponent. And if you can’t play GTO (i.e. play a profitable strategy) against either of these kinds of opponents, you’re just a fish.
In practice, no one is ever perfectly playing GTO, or even anywhere close to it. So your task should be to figure out in which direction your opponent is straying from GTO and then adjust to maximally exploit it. And in the spots that you do think your opponent is playing close enough to GTO, you can just attempt to replicate GTO yourself.
The thing is, people aren’t straying from GTO in only one spot. They’re straying from GTO in basically every spot. So it’s basically impossible to calculate the perfect strategy against a real opponent because human brains don’t have the capability of solvers.
There’s also another dimension to strategy which is the leveling game. If you can predict that your opponent will predict that you will do something, you can then adjust to your opponent’s prediction and exploit them. Assuming, of course, that you don’t think your opponent will predict your prediction of their prediction. So it’s just like waves repeatedly crashing on a shore - the energy builds up but it has nowhere to go. Eventually it crashes and the cycle is reset. Only to build up again. Forever and ever.
There are so many more poker toy games I would love to solve (different permutations of position, stack/bet sizes, ranges, flops/runouts, etc) and spend countless hours poring over. The thing is, life is too precious for that. Poker is just some very finite man-made game which people have a silly habit of betting usually-trivial amounts of money on. Poker is on its way out of this world and honestly I should be on my way out from poker. Win the game, or don’t play it.