Here are my answers for Q1-Q4:
1)
Let x be Hero's hand, with x in [0, 0.4] (hands are dealt uniformly on [0,1]; lower is better)
The probability that villain's hand is < x, i.e. better than ours, is x
The probability that villain's hand is <= 0.4, i.e. that villain calls a bet, is 0.4
The probability that villain's hand is better than ours given that he calls a bet is x/0.4
EV of checking:
(1-x)P
P-xP
EV of betting:
0.6P+0.4((1-x/0.4)(P+1)-x/0.4)
0.6P+0.4(P+1-xP/0.4-x/0.4-x/0.4)
0.6P+0.4P+0.4-xP-2x
P+0.4-xP-2x
So betting is better than checking when:
P+0.4-xP-2x > P-xP
0.4 > 2x
x < 0.2
So when our hand has more than 50% equity against villain's calling range, value betting is more profitable than checking.
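As a sanity check on the algebra above, here's a quick sketch in Python (function names are mine, not from the exercise; P drops out of the comparison, so the crossover at x = 0.2 holds for any pot size):

```python
def ev_check(x, P):
    # Hero checks: he wins the pot whenever villain's hand is worse (> x).
    return (1 - x) * P

def ev_bet(x, P, call_to=0.4):
    # Hero bets 1 into P: villain folds hands > call_to, calls with [0, call_to].
    p_call = call_to
    p_behind = x / call_to  # given a call, villain's hand is better with prob x/0.4
    return (1 - p_call) * P + p_call * ((1 - p_behind) * (P + 1) - p_behind * 1)

# Betting beats checking exactly for x < 0.2, whatever the pot size.
for P in (0.5, 1.0, 2.0):
    assert ev_bet(0.19, P) > ev_check(0.19, P)
    assert ev_bet(0.21, P) < ev_check(0.21, P)
    assert abs(ev_bet(0.2, P) - ev_check(0.2, P)) < 1e-9
```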
2)
Let y be the worst hand that villain calls with, with y in [0,1]
Let x be Hero's hand with x in (y,1]
The probability that villain calls a bet is y
EV of bluffing is >= 0 when:
(1-y)P-y >= 0
P-yP-y >= 0
P >= y(P+1)
y <= P/(P+1)
and since P=0.5, bluffing is at least breakeven when y <= 0.3333..., i.e. y in [0,0.3333].
Let's also calculate when bluffing is better than checking with a hand x, a pot P and villain calls with hands <=y:
P-yP-y > P-xP
-yP-y > -xP
yP+y < xP
x > y(P+1)/P
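The same kind of sketch for the two Q2 thresholds (again with my own helper names; P = 0.5 as above):

```python
P = 0.5

def ev_bluff(y, P):
    # Hero bluffs 1 into P; villain calls with hands <= y and always wins vs the bluff.
    return (1 - y) * P - y

def ev_check(x, P):
    return (1 - x) * P

y_star = P / (P + 1)                    # breakeven calling threshold = 1/3
assert abs(ev_bluff(y_star, P)) < 1e-9  # EV is zero exactly at y = P/(P+1)
assert ev_bluff(y_star - 0.01, P) > 0   # villain folds more often -> bluff profits
assert ev_bluff(y_star + 0.01, P) < 0

# Bluffing beats checking a hand x exactly when x > y*(P+1)/P.
y = 0.2
x_cut = y * (P + 1) / P                 # = 0.6 here
assert ev_bluff(y, P) > ev_check(x_cut + 0.01, P)
assert ev_bluff(y, P) < ev_check(x_cut - 0.01, P)
```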
3)
From 1) value bet with x < 0.15
From 2) bluff with x > 0.6
So the optimal exploitative range for Hero is [0,0.15) U (0.6,1]
4) Pot odds are 0.3333 (villain calls 1 to win the 2 in the pot). Any hand that has at least this much equity vs Hero's betting range is worth calling a bet with.
So what is h_v's equity vs Hero's range? Hero's betting range is 0.55 wide (0.15+0.4)
If h_v<=0.15, equity = (0.55-h_v)/0.55 = 1-h_v/0.55, so at the minimum 1-0.15/0.55 = 72.7%
If 0.15<h_v<=0.6, equity = (0.55-0.15)/0.55 = 72.7%, so all these hands have the same equity and are worth calling a bet with
If h_v>0.6, equity = (0.55-0.15-(h_v-0.6))/0.55 = (1-h_v)/0.55. This decreases as h_v increases, and reaches the 0.3333 equity threshold at h_v = 1-0.55/3 = 0.8167
So villain should call with [0,0.8167]
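Here's the Q4 calculation redone in exact fractions to rule out rounding slips (a sketch; the names LO/HI/W are mine):

```python
from fractions import Fraction as F

LO = F(15, 100)        # Hero's value bets: [0, 0.15)
HI = F(6, 10)          # Hero's bluffs: (0.6, 1]
W = LO + (1 - HI)      # width of Hero's betting range: 0.55

def equity(h):
    # Fraction of Hero's betting hands that are worse (numerically above) h.
    if h <= LO:
        worse = (LO - h) + (1 - HI)
    elif h <= HI:
        worse = 1 - HI
    else:
        worse = 1 - h
    return worse / W

# The plateau: every hand in [0.15, 0.6] has equity 0.4/0.55 = 8/11 = 72.7%.
assert equity(LO) == equity(F(1, 2)) == F(8, 11)

# Pot odds 1/3: solving (1 - h)/W = 1/3 gives h = 1 - W/3 = 49/60 = 0.81666...
h_star = 1 - W / 3
assert h_star == F(49, 60)
assert equity(h_star) == F(1, 3)
```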
extra:
Note that I started doing question 4 with an EV approach before I realized it could be done more simply. In the process, I got this:
Let y be the worst hand that villain calls with. We are looking for the optimal y for villain.
Let h_h be Hero's hand when he bets with h_h in [0,0.15) U (0.6,1]
Let's calculate the EV of villain as a function of y.
Let v be the event that h_h is in [0,0.15] when Hero bets.
Let b be the event that h_h is in [0.6,1] when Hero bets.
Let e be the event that h_v < h_h, given that Hero bets and villain calls with all h_v < y, y in [0,1]. pr(e) is villain's equity when he calls.
pr(v) = 0.15/(0.4+0.15) = 0.2727...
pr(b) = 0.4/(0.4+0.15) = 0.7272...
pr(e)=pr(v)*pr(e|v) + pr(b)*pr(e|b) (where pr(i|j) is the probability of i given j)
if y <= 0.15, pr(e|v)=pr(h_h<y)*pr((e|v)|h_h<y) + pr(h_h>=y)*pr((e|v)|h_h>=y)
=y/0.15*0.5 + (1-y/0.15)*1
=1-3.3333*y
if y > 0.15, pr(e|v)=pr(h_v>0.15)*pr((e|v)|h_v>0.15) + pr(h_v<=0.15)*pr((e|v)|h_v<=0.15)
=(y-0.15)/y*0 + (1-(y-0.15)/y)*0.5
=0.5(1-1+0.15/y)
=0.5*0.15/y
=0.075/y
if y <= 0.6, pr(e|b)=1
if y > 0.6, pr(e|b)=pr(h_v<0.6)*pr((e|b)|h_v<0.6) + pr(h_v>=0.6)*pr((e|b)|h_v>=0.6)
=0.6/y*1 + (1-0.6/y) * (pr(h_h<y)*0.5 + pr(h_h>=y)*1)
=0.6/y*1 + (1-0.6/y) * ((y-0.6)/0.4*0.5 + (1-(y-0.6)/0.4)*1)
=2.5-1.25y-0.45/y
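Since this piecewise algebra is easy to get wrong, here's a quick Monte Carlo cross-check of the four branches (a sketch; the sampler is mine):

```python
import random

random.seed(0)
N = 200_000

def mc_pr_e(h_lo, h_hi, y):
    # Estimate pr(h_v < h_h) with h_h ~ U[h_lo, h_hi] (Hero's betting sub-range)
    # and h_v ~ U[0, y] (villain's calling range).
    wins = sum(random.uniform(0, y) < random.uniform(h_lo, h_hi) for _ in range(N))
    return wins / N

# pr(e|v): 1 - 3.3333*y for y <= 0.15, else 0.075/y
assert abs(mc_pr_e(0, 0.15, 0.10) - (1 - 0.10 / 0.3)) < 0.01
assert abs(mc_pr_e(0, 0.15, 0.50) - 0.075 / 0.50) < 0.01
# pr(e|b): 1 for y <= 0.6, else 2.5 - 1.25*y - 0.45/y
assert mc_pr_e(0.6, 1.0, 0.5) == 1.0
assert abs(mc_pr_e(0.6, 1.0, 0.8) - (2.5 - 1.25 * 0.8 - 0.45 / 0.8)) < 0.01
```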
Finally, we can calculate villain's EV as:
EV = y*(2*pr(e) - (1-pr(e)))
EV = y*(3*pr(e) - 1)
I could go on and show how to find the point of max EV by finding the root of the derivative of the above expression, but that's when I realized that there was a much simpler solution. Regardless, I plotted villain's EV relative to y and here it is:

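The closed forms above can also be combined and maximised numerically; the peak of EV(y) should land at 1 - 0.55/3 = 49/60 ≈ 0.8167, the pot-odds threshold from question 4 (a sketch; function names are mine):

```python
def pr_e(y):
    # pr(e) = pr(v)*pr(e|v) + pr(b)*pr(e|b), using the piecewise forms above.
    p_v, p_b = 0.15 / 0.55, 0.40 / 0.55
    e_v = 1 - y / 0.3 if y <= 0.15 else 0.075 / y
    e_b = 1.0 if y <= 0.6 else 2.5 - 1.25 * y - 0.45 / y
    return p_v * e_v + p_b * e_b

def ev(y):
    # Villain calls with frequency y, winning 2 with prob pr(e), losing 1 otherwise.
    return y * (3 * pr_e(y) - 1)

ys = [i / 10_000 for i in range(1, 10_000)]
y_best = max(ys, key=ev)
assert abs(y_best - 49 / 60) < 1e-3   # grid optimum matches the pot-odds answer
```

The grid search stands in for the derivative argument; its maximum agrees with the simpler pot-odds approach.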
For 5 and 6, I didn't get them because the definition of an unexploitative strategy is not 100% clear in my mind. I thought it was the strategy that prevents the opp from making any +EV play? If so, how do you get to your criteria above?
edit: OK, I think I get it; let me know if this is wrong. Villain's initial strategy is to call only with [0,0.4], i.e. x_1 = 0.4, and Hero is exploiting this by playing an optimal strategy against villain, for which Hero's EV of betting y_0 (= 0.6) is the same as his EV of checking it, and his EV of betting y_1 (= 0.15) is the same as his EV of checking it. At this point villain is being exploited, and sure enough his EV of calling with x_1 (= 0.4) is not the same as his EV of folding it. When he adjusts his x_1 to 0.8167, he is now playing optimally against Hero's strategy and the roles are reversed: the exploiter becomes the exploited, until Hero adjusts again, and so on. Eventually they should converge to a stable state where all three conditions you listed are met, and where as soon as one of them deviates a hair from his strategy, the other one counter-adjusts a bit, and they both find themselves dragged back like magnets to the equilibrium.
Now, questions: are the equilibria in heads-up zero-sum games always stable, or are there cases where there are multiple possible equilibria and/or the equilibria are unstable and the players keep bouncing from one to another? Do all these games always converge?