Poker Toy Game II

I want to understand another toy game that is very similar to the first toy game I studied, except that the “bluffs” have 16% equity instead of 0.6%.

Here are the parameters:

Also, there are no bets allowed on the turn and river. Betting and raising are only allowed on the flop. So in some ways it’s a lot simpler than the first toy game.

Here are the preflop ranges:

[Figure: OOP preflop]

[Figure: IP preflop]

Subtree 1 (OOP Bet) Overview

Here’s the OOP strategy on the flop:

[Figure: OOP strategy]

Here’s the IP strategy facing a bet:

[Figure: IP facing bet]

Most of this seems pretty similar to last time - OOP bets with a range that makes KK indifferent to calling or folding. But I want to dig into the details because there are some interesting things I want to point out here.

Subtree 1: Simpler way of thinking about a polarized betting range that makes KK indifferent to a call or fold

How do we figure out the right ratio of nuts and air to make KK indifferent to calling or folding? Before, I would start from a range of pure nuts and pure bluffs (nuts have 100% equity, bluffs have 0% equity) and calculate the ratio that makes KK indifferent, based on the bet size relative to the pot size. But then we need another adjustment to factor in the equities, because nuts and bluffs typically don’t have exactly 100% and 0% equity.

Instead of doing all that, what if we just consider the equity of the entire range as a whole?

Against a pot-sized bet, KK need to be good ⅓ of the time to make a break-even call (and be indifferent to calling or folding). So instead of getting all fancy and breaking down bluff ratios and then factoring in equity, we can just say that KK should have 33% equity against a balanced polarized betting range that is betting 1x pot. And indeed, we see that this is exactly true:

[Figure: Equity KK facing flop bet]
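The shortcut can be checked with a few lines of arithmetic. This is a rough sketch, not solver output. It assumes a pot of 50 (inferred from the EV formula later in this post), that AA has about 91.7% equity against KK, and that every bluff has A2o's ~16.4% equity. The function names are made up for the example.

```python
def required_equity(pot, bet):
    """Equity a caller needs to break even: call `bet` to win `pot + bet`."""
    return bet / (pot + 2 * bet)


def bluff_share_for_indifference(pot, bet, caller_eq_vs_value, caller_eq_vs_bluff):
    """Share of bluffs in the betting range that leaves the caller with
    exactly the break-even equity against the range as a whole."""
    target = required_equity(pot, bet)
    return (target - caller_eq_vs_value) / (caller_eq_vs_bluff - caller_eq_vs_value)


pot, bet = 50, 50            # pot-sized bet; the pot size itself is an assumption
kk_vs_aa = 1 - 0.917         # assumed: AA has ~91.7% equity against KK
kk_vs_a2o = 1 - 0.164        # A2o has ~16.4% equity against KK

need = required_equity(pot, bet)
bluffs = bluff_share_for_indifference(pot, bet, kk_vs_aa, kk_vs_a2o)
range_equity = (1 - bluffs) * kk_vs_aa + bluffs * kk_vs_a2o

print(f"KK needs {need:.1%} equity to call")
print(f"bluff share of the betting range: {bluffs:.1%}")
print(f"KK equity against that range: {range_equity:.1%}")
```

Under these assumptions the whole-range equity lands on one third by construction, which is the point of the shortcut: the bluff ratio and the per-hand equities never have to be handled separately.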

Subtree 1: How to figure out the GTO call/fold mixing frequency for KK facing a flop bet

In our first toy game, facing a polarized betting range of AA and offsuit trash, KK mixed at approximately 50%. We figured out that the reason was:

So KK mixed at the right frequency to make bluffs indifferent to bluffing or check-folding.

In this toy game, the right mixing frequency seems to be different:

[Figure: KK mixing frequency against bet]

Incorporating the fact that A2o has ~16.4% equity against KK, we can create a formula for the EV when A2o bluffs KK, where:

The formula is:

If we solve this for an EV of 0, KK should fold 34% of the time. However, in the sim, A2o is actually indifferent to bluffing when its EV is the same as a check, not a fold. And the EV on a check is 3.8:

[Figure: OOP EV Ax]

So if we solve the A2o bluff EV expression for 3.8, we get a folding frequency of 39%. Which is indeed the frequency with which KK folds:

[Figure: KK mixing frequency against bet]
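Here is one way to reproduce the 34% and 39% figures. It is a reconstruction, not the original formula. It assumes a pot of 50 and a pot-sized bet of 50, and measures the bluff's EV from the start of the flop decision (so folding is worth 0). The function names are hypothetical.

```python
def bluff_ev(fold_freq, pot, bet, equity):
    """EV of betting a bluff: win the pot when villain folds, otherwise
    win pot + bet with `equity` and lose the bet the rest of the time."""
    ev_when_called = equity * (pot + bet) - (1 - equity) * bet
    return fold_freq * pot + (1 - fold_freq) * ev_when_called


def fold_freq_for_target_ev(target_ev, pot, bet, equity):
    """Fold frequency at which bluff_ev(...) equals target_ev (linear solve)."""
    ev_when_called = equity * (pot + bet) - (1 - equity) * bet
    return (target_ev - ev_when_called) / (pot - ev_when_called)


pot, bet, a2o_equity = 50, 50, 0.164   # pot of 50 is an assumption

vs_fold = fold_freq_for_target_ev(0.0, pot, bet, a2o_equity)
vs_check = fold_freq_for_target_ev(3.8, pot, bet, a2o_equity)

print(f"bluff EV equals a fold (0):    KK folds {vs_fold:.1%}")
print(f"bluff EV equals a check (3.8): KK folds {vs_check:.1%}")
```

The only thing that changes between the two answers is the target EV, which is why the comparison point (check versus fold) matters so much for the mixing frequency.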

Theoretically, if KK fold more than 39% of the time, A2o should have a higher EV betting than checking, and should always bet. So I tried node locking KK to fold 40% against a bet, and in response, A2o is indeed always betting:

[Figure: OOP adjust to KK overfold node lock]

[Figure: KK fold to bet node lock]

If we lean the other way and make KK overcall, then A2o has a lower EV betting than checking, and should always check and never bluff. And that is indeed the case (here KK is node locked to underfold, folding only 35% of the time):

[Figure: OOP adjust to KK overcall node lock]

[Figure: KK overcall node lock]

So when KK overcall, we are never bluffing with A2o/A3o. It seems this also results in checking AA a little more so we can continue to realize the equity of our entire checking range (I discuss what this means in the next section). The rest of the time, AA bet to get value from KK and deny its equity.

The high-level takeaway here: even when KK is indifferent to calling or folding facing a polarized balanced range, KK needs to mix at the right frequency, and that frequency is based on making the bluffs indifferent to bluffing when compared to their other actions. If the bluffs aren’t indifferent to bluffing, they’ll either always bluff, or never bluff, based on whether KK is calling too often or too infrequently.
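The all-or-nothing response can be sketched as a simple comparison. This is only an illustration: it assumes a pot of 50 with a pot-sized bet, and it holds the EV of checking fixed at 3.8, which a node-locked sim would not necessarily do. The function names are invented for the example.

```python
def bluff_ev(fold_freq, pot=50, bet=50, equity=0.164):
    """EV of bluffing A2o into KK (pot and bet sizes are assumptions)."""
    ev_when_called = equity * (pot + bet) - (1 - equity) * bet
    return fold_freq * pot + (1 - fold_freq) * ev_when_called


def a2o_best_response(fold_freq, check_ev=3.8, tol=1e-6):
    """Pick the higher-EV action; mix only when the two EVs are equal."""
    diff = bluff_ev(fold_freq) - check_ev
    if abs(diff) <= tol:
        return "mix"
    return "always bet" if diff > 0 else "always check"


for kk_fold in (0.35, 0.40):
    print(f"KK folds {kk_fold:.0%}: bet EV {bluff_ev(kk_fold):.2f} -> "
          f"{a2o_best_response(kk_fold)}")
```

Because the bluff's EV is linear in the fold frequency, any deviation from the indifference point, however small, tips the best response entirely to one side.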

TODO for the future: What happens if we have multiple bluffs with different amounts of equity? Does KK adjust to the weakest bluff, or the average of all the bluffs?

Now, we can move on to the other half of the game tree, which is an OOP check. What variables affect the OOP checking strategy?

Subtree 2 (OOP Check) Overview

Just to review, here’s the OOP strategy on the flop:

[Figure: OOP strategy]

And here’s the IP strategy facing an OOP check (mixing between checking back and betting):

[Figure: IP facing flop check]

And here’s the OOP strategy facing an IP bet:

[Figure: OOP facing IP bet]

Subtree 2 key concept 1: Facing an OOP check, KK are indifferent to checking back or betting

Since there are no more streets of betting after the flop, if KK check the flop back, it goes straight to showdown. This means both players just get to realize the equity of their entire ranges. In that case, the EV of the two players should just be equity * pot. And indeed, we see this is true:

[Figure: KK check back equity]

[Figure: KK check back EV]

Ok, so we know what’s going on when KK check back. But what about if KK bet? We know that KK are indifferent to betting or checking here, since they mix between the two plays. Therefore, the EVs of the two decisions must be the same. Let’s calculate the EV of a KK bet.

When KK bet:

[Figure: OOP facing IP bet]

[Figure: IP facing checkraise]

So KK EV on a bet is 0.82(pot) - 0.18(bet) = 36.5. And as we can see, that is the same as KK EV on a check!
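The arithmetic behind that number can be sketched as follows. The sizes are inferred rather than stated: a pot of 50 and a KK bet of 25 are the values that make 0.82(pot) - 0.18(bet) come out to 36.5. The sketch also assumes OOP only folds or raises, and that KK nets minus its bet whenever it is raised (it is indifferent between calling and folding there). The helper name is hypothetical.

```python
pot, bet = 50, 25                  # inferred sizes, not stated in the text
oop_fold, oop_raise = 0.82, 0.18   # OOP's response to the KK bet

# KK wins the pot when OOP folds and is down its bet when OOP raises.
kk_ev_bet = oop_fold * pot - oop_raise * bet

# Showdown equity KK would need for a check-back to be worth the same.
implied_check_equity = kk_ev_bet / pot


def indifferent_raise_freq(equity, pot, bet):
    """Raise frequency r solving (1 - r) * pot - r * bet = equity * pot."""
    return pot * (1 - equity) / (pot + bet)


print(f"KK EV on a bet: {kk_ev_bet:.1f}")
print(f"implied KK equity when checking back: {implied_check_equity:.0%}")
print(f"raise frequency that keeps KK indifferent: "
      f"{indifferent_raise_freq(implied_check_equity, pot, bet):.0%}")
```

Read in reverse, the last function says how often OOP has to check-raise for a given KK showdown equity, which is the sense in which the checking range is protected.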

So what’s going on here? OOP has mixed enough AA into its checking range that OOP’s EV is the same whether:

So OOP has mixed in enough AA into its checking range such that OOP realizes the entire equity of its checking range, DESPITE OOP not always reaching showdown with its entire checking range. It’s really a very beautiful concept to see.

One detail is that if KK bet larger, OOP doesn’t need to check as many AA to realize the equity of its checking range. This is because it’s getting a larger reward (the KK bet) when it does polarize, so it doesn’t need to make a polarized raise against a KK bet as often. I tried a sim where I only allowed a pot-sized bet from KK, and indeed, AA aren’t checking the flop as much:

[Figure: KK larger bet nodelock OOP]

[Figure: KK larger bet nodelock]

[Figure: KK larger bet nodelock OOP facing bet]

Subtree 2 key concept 2: If KK are indifferent to checking back or betting after OOP checks, how does GTO decide the optimal frequency to mix at?

I think in order to answer this question, we need to think from the perspective of AA. We know the EV of AA when betting - it’s actually higher than the pot:

[Figure: AA EV on a bet]

The reason for this is that when OOP polarizes and makes KK indifferent to calling or folding, KK’s EV is 0, BUT KK has to defend some of the time to keep our bluffs indifferent to bluffing. So when KK defends at this break-even frequency, AA makes more money than the pot whenever KK calls (on average, because KK does have some equity against AA).
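A quick estimate shows how far above the pot AA's betting EV ends up. This is a sketch under assumptions, not the sim's number: pot of 50, pot-sized bet, AA with about 91.7% equity against KK, and KK folding 39% of the time as found earlier. The function name is made up.

```python
def value_bet_ev(fold_freq, pot, bet, equity):
    """EV of betting a value hand: win the pot when villain folds, otherwise
    win pot + bet with `equity` and lose the bet the rest of the time."""
    ev_when_called = equity * (pot + bet) - (1 - equity) * bet
    return fold_freq * pot + (1 - fold_freq) * ev_when_called


pot, bet = 50, 50     # assumed sizes
aa_equity = 0.917     # assumed AA equity against KK
kk_fold = 0.39        # KK's folding frequency against the flop bet

aa_ev = value_bet_ev(kk_fold, pot, bet, aa_equity)
print(f"AA EV when betting: {aa_ev:.1f} (pot is {pot})")
```

The surplus over the pot comes entirely from the calls: each call is worth far more than the pot to AA, and KK cannot stop calling without letting the bluffs print money.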

When AA slowplay, KK have to bet at the right frequency to make AA indifferent to betting the flop or slowplaying. If KK bets too often, AA will always slowplay, because the EV of slowplaying will be higher than betting the flop. If KK bets too infrequently, AA will never slowplay, because the EV of betting the flop (fastplaying?) will be higher. Of course, we must verify.

Normally KK bet 54% of the time when checked to. If we nodelock to 55% (betting too much), we can see AA always slowplay:

[Figure: KK overbet node lock OOP]

[Figure: KK overbet node lock]

[Figure: KK overbet node lock OOP facing bet]

If we nodelock to 53% (betting too little) we can see AA never slowplay:

[Figure: KK overcheck node lock OOP]

[Figure: KK overcheck node lock]

[Figure: KK overcheck node lock OOP facing bet]

Finally, I just want to run the math of how we calculate the proper betting frequency of KK when checked to, in order to make AA indifferent to slowplaying (protecting the checking range) or fastplaying (betting out on the flop). To do that, we need to come up with a formula for AA’s EV on a check, and then set it equal to AA’s EV when betting.

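If X is the frequency with which KK check back, and Y is the frequency with which KK fold to a check-raise: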

AA EV when checking is:

X(0.917)(50) + (1-X)(Y)(75) + (1-X)(1-Y)(125)(0.917) + (1-X)(1-Y)(-75)(0.083)

In order to solve for X, we need to know Y. Y is the folding frequency of KK facing a check-raise which makes bluffs indifferent to bluffing or folding (EV = 0 in both cases). In this case, it happens to be 36%:

[Figure: KK facing a check raise]

If we solve for this folding frequency:

So it’s right.
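The 36% can be reproduced with a short calculation. This is a reconstruction under assumptions read off the AA formula above: a pot of 50, a KK bet of 25, and a check-raise to 75, with the bluff holding A2o's ~16.4% equity. The function name is hypothetical.

```python
def fold_freq_for_zero_ev_raise(pot, villain_bet, raise_to, equity):
    """Villain fold frequency Y at which a bluff check-raise has EV 0:
    Y * win_if_fold + (1 - Y) * ev_if_called = 0."""
    win_if_fold = pot + villain_bet
    # When called, villain has put in `raise_to` in total.
    ev_if_called = equity * (pot + raise_to) - (1 - equity) * raise_to
    return -ev_if_called / (win_if_fold - ev_if_called)


pot, kk_bet, raise_to = 50, 25, 75   # sizes inferred, not stated in the text
a2o_equity = 0.164

y = fold_freq_for_zero_ev_raise(pot, kk_bet, raise_to, a2o_equity)
print(f"KK folds to the check-raise {y:.1%} of the time")
```

The comparison point here is a fold (EV 0) rather than some other action, because once KK has bet, the bluff's only alternative to raising is giving up.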

So now that we know how often KK are calling a check-raise, AA EV when checking is:

Which is indeed how often KK are checking back when OOP checks:

[Figure: IP facing flop check]

So it all checks out.
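For anyone who wants to redo the last step, here is a sketch of solving the AA check EV formula for X. It plugs in Y = 0.36 and takes AA's betting EV from a KK folding frequency of 39% against a pot-sized bet. The pot of 50 and AA's 91.7% equity are read off the formula above. Variable names are illustrative only.

```python
aa_equity = 0.917
y = 0.36          # KK folding frequency facing a check-raise
kk_fold = 0.39    # KK folding frequency facing the flop bet

# AA's EV when betting out: pot of 50, pot-sized bet of 50 (assumed sizes).
aa_ev_bet = kk_fold * 50 + (1 - kk_fold) * (aa_equity * 100 - (1 - aa_equity) * 50)

# AA's EV when checking is X * ev_if_kk_checks + (1 - X) * ev_if_kk_bets.
ev_if_kk_checks = aa_equity * 50
ev_if_kk_bets = y * 75 + (1 - y) * (aa_equity * 125 - (1 - aa_equity) * 75)

# Set the check EV equal to the bet EV and solve the linear equation for X.
x = (ev_if_kk_bets - aa_ev_bet) / (ev_if_kk_bets - ev_if_kk_checks)

print(f"AA EV when betting: {aa_ev_bet:.1f}")
print(f"KK check back {x:.0%} of the time and bet {1 - x:.0%}")
```

The inputs are rounded, so the result is only good to about a percentage point, but that is enough to line up with the 54% betting frequency quoted earlier.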

Closing Thoughts

Honestly this solve has convinced me regarding the merits of exploitative play. What I mean is, there’s actually no such thing as “exploitative play”. What people commonly refer to as exploitative play is just GTO play against a non-GTO opponent. And what people commonly refer to as GTO play is simply GTO play against a GTO opponent. And if you can’t play GTO (i.e. play a profitable strategy) against either of these kinds of opponents, you’re just a fish.

In practice, no one is ever perfectly playing GTO, or even anywhere close to it. So your task should be to figure out in which direction your opponent is straying from GTO and then adjust to maximally exploit it. And in the spots that you do think your opponent is playing close enough to GTO, you can just attempt to replicate GTO yourself.

The thing is, people aren’t straying from GTO in only one spot. They’re straying from GTO in basically every spot. So it’s basically impossible to calculate the perfect strategy against a real opponent because human brains don’t have the capability of solvers.

There’s also another dimension to strategy which is the leveling game. If you can predict that your opponent will predict that you will do something, you can then adjust to your opponent’s prediction and exploit them. Assuming, of course, that you don’t think your opponent will predict your prediction of their prediction. So it’s just like waves repeatedly crashing on a shore - the energy builds up but it has nowhere to go. Eventually it crashes and the cycle is reset. Only to build up again. Forever and ever.

There are so many more poker toy games I would love to solve (different permutations of position, stack/bet sizes, ranges, flops/runouts, etc) and spend countless hours poring over. The thing is, life is too precious for that. Poker is just some very finite man-made game which people have a silly habit of betting usually-trivial amounts of money on. Poker is on its way out of this world and honestly I should be on my way out from poker. Win the game, or don’t play it.