
spoonitnow's 5000th Post: Game Theory and Poker

  1. #1
daviddem
    Join Date
    Aug 2009
    Posts
    1,505
    Location
    Philippines/Saudi Arabia
    Here are my answers for Q1-Q4:

    1)
    Let x be Hero's hand with x in [0,0.4]

    The probability that villain's hand is < x is x (villain's hand is better than ours)
    The probability that villain's hand is <= 0.4 is 0.4 (villain calls a bet)
    The probability that villain's hand is < x given that he calls a bet is x/0.4 (villain's hand is better when he calls a bet)

    EV of checking:
    (1-x)P
    P-xP

    EV of betting:
    0.6P + 0.4((1-x/0.4)(P+1) - x/0.4)
    0.6P + 0.4(P+1) - x(P+1) - x
    0.6P + 0.4P + 0.4 - xP - x - x
    P + 0.4 - xP - 2x

    So betting is better than checking when:
    P+0.4-xP-2x > P-xP
    0.4 > 2x
    x < 0.2
    So when our hand has more than 50% equity against villain's calling range, we can value bet profitably.
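
    If you want to sanity-check the algebra numerically, here's a minimal Python sketch (the [0, 0.4] calling range and the 1-unit bet are from the problem setup; the pot size P is arbitrary since the cutoff doesn't depend on it):

```python
# Q1 check: compare EV(check) and EV(bet) for Hero's hand x,
# when villain calls a 1-unit bet only with hands in [0, 0.4].
def ev_check(x, P):
    # Checking: win the pot whenever villain's hand is worse (> x).
    return (1 - x) * P

def ev_bet(x, P, call_thresh=0.4):
    # Villain folds hands above call_thresh; of his calling hands,
    # those above x lose to us, those below x beat us.
    fold = 1 - call_thresh
    return fold * P + (call_thresh - x) * (P + 1) - x * 1

P = 2.0  # any pot size works; the x < 0.2 cutoff is independent of P
for x in (0.1, 0.19, 0.2, 0.21, 0.3):
    print(x, round(ev_bet(x, P) - ev_check(x, P), 4))
# Positive below x = 0.2, zero at x = 0.2, negative above it.
```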

    2)

    Let y be the worst hand that villain calls with, with y in [0,1]
    Let x be Hero's hand with x in (y,1]

    The probability that villain calls a bet is y

    EV of bluffing is >= 0 when
    (1-y)P-y >= 0
    P-yP-y >= 0
    P/(P+1) >= y
    y <= P/(P+1)

    and since P=0.5, y<=0.3333... so y in [0,0.3333].

    Let's also calculate when bluffing is better than checking with a hand x and a pot P, when villain calls with hands <= y:
    P-yP-y > P-xP
    -yP-y > -xP
    yP+y < xP
    x > y(P+1)/P
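
    The same kind of numerical check works here (a sketch under the Q2 assumptions: pot P = 0.5, 1-unit bet, villain calls with [0, y], and a bluffing hand is worse than every calling hand):

```python
# Q2 check: EV of bluffing vs. checking, pot P = 0.5, bet = 1.
P = 0.5

def ev_bluff(y):
    # A bluff only wins when villain folds (hands above y).
    return (1 - y) * P - y

def ev_check(x):
    return (1 - x) * P

print(round(ev_bluff(P / (P + 1)), 4))   # 0.0: bluffs break even at y = 1/3

y = 0.3
x_cut = y * (P + 1) / P                  # bluffing beats checking above this hand
print(round(ev_bluff(y) - ev_check(x_cut), 4))   # 0.0 right at the cutoff
```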

    3)
    With Q3's parameters (P = 1, villain calls with [0, 0.3]):
    From 1) value bet with x < 0.15
    From 2) bluff with x > 0.6

    So the optimal exploitative range for Hero is [0,0.15) U (0.6,1]

    4) The pot odds require 0.3333 equity (calling 1 to win 2). Any hand that has at least this much equity vs Hero's betting range is worth calling a bet with.
    So what is h_v's equity vs Hero's range? Hero's range is 0.55 wide (0.15+0.4)

    If h_v<=0.15, equity = (0.55-h_v)/0.55 = 1-h_v/0.55, so at the minimum 1-0.15/0.55= 72.7%

    If 0.15<h_v<=0.6, equity = (0.55-0.15)/0.55 = 72.7%, so all these hands have the same equity and are all worth calling a bet with

    If h_v>0.6, equity = (0.55-0.15-(h_v-0.6))/0.55 = (1-h_v)/0.55. This obviously decreases as h_v increases, and reaches the 0.3333 equity threshold at h_v = 1 - 0.55/3 = 0.8167

    So villain should call with [0,0.8167]
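
    Here's the same piecewise equity function in code, if anyone wants to play with it (the [0, 0.15) U (0.6, 1] betting range and the 1/3 pot-odds requirement are straight from the above):

```python
# Q4 sketch: villain's equity against Hero's betting range
# [0, 0.15) U (0.6, 1] (total width 0.55), and the calling threshold
# where equity falls to the 1/3 pot-odds requirement.
WIDTH = 0.15 + 0.4

def equity(h_v):
    # Fraction of Hero's betting range that h_v beats (lower hand wins).
    if h_v <= 0.15:
        beaten = (0.15 - h_v) + 0.4
    elif h_v <= 0.6:
        beaten = 0.4
    else:
        beaten = 1 - h_v
    return beaten / WIDTH

threshold = 1 - WIDTH / 3        # solve (1 - h_v)/0.55 = 1/3
print(round(threshold, 4), round(equity(threshold), 4))   # 0.8167 0.3333
```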

    extra:
    Note that I started doing question 4 with an EV approach before I realized it could be done simpler than that. In the process, I got this:
    Let y be the worst hand that villain calls with. We are looking for the optimal y for villain.
    Let h_h be Hero's hand when he bets with h_h in [0,0.15) U (0.6,1]

    Let's calculate the EV of villain as a function of y.

    Let v be the event that h_h is in [0,0.15] when Hero bets.
    Let b be the event that h_h is in [0.6,1] when Hero bets.
    Let e be the event that h_v < h_h when Hero bets and villain calls with all h_v < y with y in [0,1]. pr(e) is villain's equity.

    pr(v) = 0.15/(0.4+0.15) = 0.2727...
    pr(b) = 0.4/(0.4+0.15) = 0.7272...
    pr(e)=pr(v)*pr(e|v) + pr(b)*pr(e|b) (where pr(i|j) is the probability of i given j)

    if y <= 0.15, pr(e|v)=pr(h_h<y)*pr((e|v)|h_h<y) + pr(h_h>=y)*pr((e|v)|h_h>=y)
    =y/0.15*0.5 + (1-y/0.15)*1
    =1-3.3333*y
    if y > 0.15, pr(e|v)=pr(h_v>0.15)*pr((e|v)|h_v>0.15) + pr(h_v<=0.15)*pr((e|v)|h_v<=0.15)
    =(y-0.15)/y*0 + (1-(y-0.15)/y)*0.5
    =0.5(1-1+0.15/y)
    =0.5*0.15/y
    =0.075/y
    if y <= 0.6, pr(e|b)=1
    if y > 0.6, pr(e|b)=pr(h_v<0.6)*pr((e|b)|h_v<0.6) + pr(h_v>=0.6)*pr((e|b)|h_v>=0.6)
    =0.6/y*1 + (1-0.6/y) * (pr(h_h<y)*0.5 + pr(h_h>=y)*1)
    =0.6/y*1 + (1-0.6/y) * ((y-0.6)/0.4*0.5 + (1-(y-0.6)/0.4)*1)
    =2.5-1.25y-0.45/y

    Finally, we can calculate villain's EV as:
    EV=y*(2pr(e)-(1-pr(e)))
    EV=y*(3pr(e)-1)

    I could go on and show how to find the point of max EV by finding the root of the derivative of the above expression, but that's when I realized that there was a much simpler solution. Regardless, I plotted villain's EV against y and here it is:

    [image: plot of villain's EV as a function of y]
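
    For anyone who wants to reproduce that plot, here's a sketch using the piecewise pr(e) derived above (assuming pot = 1 and bet = 1, which is the 2:1 payoff implied by EV = y*(3pr(e)-1) and the Q3/Q4 setup):

```python
# Villain's EV as a function of his calling threshold y.
import numpy as np
import matplotlib.pyplot as plt

def pr_e_given_v(y):
    # Villain's equity when Hero is value betting [0, 0.15).
    return 1 - y / 0.3 if y <= 0.15 else 0.075 / y

def pr_e_given_b(y):
    # Villain's equity when Hero is bluffing (0.6, 1].
    return 1.0 if y <= 0.6 else 2.5 - 1.25 * y - 0.45 / y

def villain_ev(y):
    p_v, p_b = 0.15 / 0.55, 0.40 / 0.55
    p_e = p_v * pr_e_given_v(y) + p_b * pr_e_given_b(y)
    return y * (3 * p_e - 1)     # call y, win 2 with prob p_e, lose 1 otherwise

ys = np.linspace(0.01, 1.0, 1000)
evs = [villain_ev(y) for y in ys]
print(round(ys[int(np.argmax(evs))], 3))   # ~0.817, matching the pot-odds answer

plt.plot(ys, evs)
plt.xlabel("y (villain's calling threshold)")
plt.ylabel("villain's EV")
plt.show()
```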
    For 5 and 6, I didn't get it because the definition of unexploitative strategy is not 100% clear in my mind. I thought it was the strategy that prevents the opp from making any +EV play? If so, how do you get to your criteria above?

    edit: OK I think I get it, let me know if this is wrong. Villain's initial strategy is to call only with [0,0.4], i.e. x_1=0.4, and Hero is exploiting this by playing an optimal strategy against villain, for which Hero's EV of betting y_0 (=0.6) is the same as his EV of checking it, and his EV of betting y_1 (=0.15) is the same as his EV of checking it. So at this time, villain is being exploited, and sure enough, his EV of calling with x_1 (0.4) is not the same as his EV of folding it. When he adjusts his x_1 to 0.8167, he is now playing optimally against Hero's strategy and the roles are reversed: the exploiter becomes the exploited, until Hero adjusts again, and so on. Eventually, they should converge to a stable state where all three conditions you listed are met, and where as soon as one of them deviates a hair from his strategy, the other one counter-adjusts a bit, and they both find themselves dragged back to the equilibrium like magnets.

    Now questions: are the equilibria in heads up zero sum games always stable, or are there cases where there are multiple possible equilibria and/or the equilibria are unstable and the players keep bouncing from one to another? Do all these games always converge?
    Last edited by daviddem; 12-11-2010 at 11:49 AM.
    Virginity is like a bubble: one prick and it's all gone
    Ignoranus (n): A person who is stupid AND an assh*le
  2. #2
spoonitnow
    Join Date
    Sep 2005
    Posts
    14,219
    Location
    North Carolina
    Quote Originally Posted by daviddem
    For 5 and 6, I didn't get it because the definition of unexploitative strategy is not 100% clear in my mind. I thought it was the strategy that prevents the opp from making any relatively +EV play? If so, how do you get to your criteria above?
    Imagine both players are playing their own optimal exploitative strategy against each other at the same time. A consequence of this would be that neither player can do better by changing their strategy, because by definition, the optimal exploitative strategy is the best-performing strategy in terms of EV that you can have against a given opponent's strategy.

    So neither player can improve their EV by changing their strategy, and this creates a certain equilibrium. If either player deviates from the equilibrium by changing their play, then they lose EV (since they will be performing worse than the optimal exploitative strategy, which was the maximum EV they could obtain).

    In this equilibrium, neither player can be exploited by a change in strategy. Therefore, their strategies are both unexploitable.

    So what is a change in strategy in this game? Villain has one strategic option, and that's where to place x_1. Hero has two strategic options, and those are where to place y_1 and y_0.

    When Villain is playing his optimal exploitative strategy, the EV of folding x_1 will be the same as the EV of calling x_1. Similarly, when Hero is playing his optimal exploitative strategy, the EV of betting y_1 or y_0 will be the same as the EV of checking y_1 or y_0, respectively.

    So we set the EV of folding x_1 equal to the EV of calling x_1, the EV of betting y_1 equal to the EV of checking y_1, and the EV of betting y_0 equal to the EV of checking y_0. When we do all of this, then both players are playing their respective optimal exploitative strategies, and therefore are both playing unexploitably. The solutions for x_1, y_1, and y_0 in this system of equations gives us each player's unexploitable strategy.
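
    Here's a SymPy sketch of solving that system, using the three relations worked through in the derivation quoted below (y_1 = (1 - y_0)/a, x_1 = 2y_1, and y_0 = x_1/(1 - a)):

```python
# Solving the three indifference conditions symbolically with SymPy.
import sympy as sp

a, x1, y1, y0 = sp.symbols('a x_1 y_1 y_0')
sol = sp.solve(
    [sp.Eq(y1, (1 - y0) / a),      # Villain: EV(call x_1) = EV(fold x_1)
     sp.Eq(x1, 2 * y1),            # Hero: EV(bet y_1) = EV(check y_1)
     sp.Eq(y0, x1 / (1 - a))],     # Hero: EV(bet y_0) = EV(check y_0)
    [x1, y1, y0], dict=True)[0]

for var in (y1, x1, y0):
    print(var, '=', sp.factor(sol[var]))
# y_1 comes out equivalent to (1 - a)/(2 + a*(1 - a)), and so on.
```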

    Quote Originally Posted by daviddem
    (2) in (3):
    y_0=2y_1/(1-a) (4)
    (4) in (1):
    y_1=(1-2y_1/(1-a))/a
    ay_1=1-2y_1/(1-a)
    (a+2/(1-a))y_1=1
    (a(1-a)+2)*y_1=(1-a)

    y_1=(1-a)/(2+a(1-a))


    so by (2):
    x_1=2(1-a)/(2+a(1-a)) (5)

    (5) in (3):
    y_0=2/(2+a(1-a))

    Not sure if there is a way to make them look better...

    ^^Some typos spoon, too many calling and folding, not enough betting and checking.
    Thanks for pointing out the typos. And that's as good as you're going to make them look in text.
    Last edited by spoonitnow; 12-11-2010 at 01:13 PM.
  3. #3
    Quote Originally Posted by spoonitnow
    And that's as good as you're going to make them look in text.
    The bottom terms can be slightly simplified, although it doesn't really look that much prettier in the text.

    2 + a(1-a) = 2 + a - a^2 = -1(a^2 - a - 2) = -1(a-2)(a+1) = (2-a)(a+1)

    y_1 = (1-a)/(2+a(1-a)) = (1-a)/( (2-a)(a+1) )
    x_1 = 2(1-a)/(2+a(1-a)) = 2(1-a)/( (2-a)(a+1) )
    y_0 = 2/(2+a(1-a)) = 2/( (2-a)(a+1) )
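
    A one-liner confirms the factoring (just a SymPy check):

```python
# Verify that (2 - a)(a + 1) expands back to 2 + a(1 - a).
import sympy as sp
a = sp.symbols('a')
print(sp.expand((2 - a) * (a + 1)))   # -a**2 + a + 2, i.e. 2 + a*(1 - a)
```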
  4. #4
spoonitnow
    Join Date
    Sep 2005
    Posts
    14,219
    Location
    North Carolina
    The [0, 1] Half-Street Fixed-Limit Game - Balanced Play vs. Exploitative Play

    This is my official 5000th post, and I feel like what this post illustrates is pretty important, so I hope some people get a lot out of it.

    Given x_1, the optimal exploitative value betting range is going to be [0, y_1] such that y_1 = x_1 / 2. With this value betting range, how often do we have to bluff so that we're unexploitable?

    We'd like the EV of Villain calling with x_1 to be the same as the EV of folding x_1. Earlier we did this and found that the resulting equation was when y_1 = (1 - y_0)/a. We already know what y_1 will be, so let's change this equation around to make it more convenient for finding y_0 given y_1:

    y_1 = (1 - y_0)/a
    ay_1 = (1 - y_0)
    y_0 = 1 - ay_1

    So given x_1, we should be value betting [0, x_1 / 2], and bluffing [1 - (a/2)x_1, 1] if we want Villain's call with x_1 to be break even.

    Earlier we found that the optimal exploitative bluffing frequency was set by y_0 = x_1/(1-a). Now we're looking at the balanced bluffing frequency being set by y_0 = 1 - (a/2)x_1. The relationship between these two values is very important in the discussion of balance vs. exploitative play.

    Let's take a look at an example scenario. Suppose P = 1.5 and Villain calls with [0, 0.3], making x_1 = 0.3. Note that a = 0.4. Then the optimal exploitative value betting frequency is set by y_1 = 0.15, the optimal exploitative bluffing frequency is set by y_0 = 0.3/(1-0.4) = 0.5, and the balanced bluffing frequency is set by y_0 = 1 - (0.4/2)(0.3) = 0.94. If we plot these values on a number line, we can find some interesting results.

    [image: number line from 0 to 1 marking y_1 = 0.15, x_1 = 0.3, exploitative y_0 = 0.5, and balanced y_0 = 0.94, with the five colored segments described below]
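
    Here's the same example as a tiny calculation (a = 1/(P + 1) is my assumption for how a relates to the pot with a 1-unit bet; it reproduces a = 0.4 at P = 1.5 as stated above):

```python
# The example scenario in numbers: P = 1.5, Villain calls [0, 0.3].
P = 1.5
a = 1 / (P + 1)               # 0.4 for a 1-unit bet into a 1.5 pot
x1 = 0.3                      # Villain's calling threshold
y1 = x1 / 2                   # optimal exploitative value betting: 0.15
y0_exploit = x1 / (1 - a)     # optimal exploitative bluffing cutoff: 0.5
y0_balanced = 1 - a * y1      # balanced bluffing cutoff: 0.94
print(a, y1, y0_exploit, y0_balanced)
```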
    Note: Don't fall into the trap of thinking that all of this is a lot of theoretical stuff that doesn't apply to "real" poker. In no-limit hold'em, this situation would basically be like we had 2/3 pot left behind on the river in position and Villain had us covered, and after Villain checks to us, we have similar ranges. Additionally, this game can be slightly altered so that we don't have the same ranges.

    I've colored the five important segments of the [0, 1] distribution in the above graphic. The bright red segment is the part of Hero's range that he value bets, all of which shows a profit. The dark red segment is the part of Villain's range that he calls with, but that does not beat the worst hand Hero value bets. This section exists because Hero bluffs a non-zero percentage of the time, and because of the effect of pot odds.

    The blue section is the range of hands that Hero can bluff with if he wants to exploit Villain with an increased bluffing frequency. By bluffing hands from this section, he increases his EV at the risk of being exploited by Villain calling more.

    The green section is the range of hands that Hero should always bluff. If he doesn't bluff these hands, he loses EV. Unlike the blue section, there is no risk in bluffing these hands, because Villain cannot exploit Hero by calling more. Moreover, if you don't bluff these hands, you become exploitable by Villain calling less. So basically, you have to bluff these hands because 1) it's +EV compared to checking, and 2) if you don't, then you become exploitable.
  5. #5
spoonitnow
    Join Date
    Sep 2005
    Posts
    14,219
    Location
    North Carolina
    The [0, 1] Half-Street Fixed-Limit Game - Balanced Play vs. Exploitative Play (cont.)

    In the scenario in the post before this one, P = 1.5 making a = 0.4, and we said Villain called with 30% of his range (x_1 = 0.3). Villain is folding too much, so we have a large range where bluffing is better than checking. But what happens if he's calling too much (like a lot of bad players, especially at the micros, do)?

    Let P = 1.5 again (a = 0.4), but let x_1 = 0.7 now. If we do our calculations, we'll find that y_1 = 0.35 and that the balanced bluffing range is defined by y_0 = 0.86. However, if you look at the EV equations for checking or bluffing when Hero is dealt the hand 1, you'll see that checking is better. This means that there are no hands where Hero can make a bluff that has a higher EV than checking.
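
    To see it concretely, using the same EV formulas as in the earlier posts:

```python
# With P = 1.5 and Villain calling [0, 0.7], even the worst hand (1.0)
# prefers checking to bluffing.
P, x1 = 1.5, 0.7
ev_check = (1 - 1.0) * P       # 0: hand 1 never wins at showdown
ev_bluff = (1 - x1) * P - x1   # 0.45 - 0.7 = -0.25
print(ev_check, ev_bluff)
```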

    Our balanced strategy includes a decent amount of bluffs because we are value betting with such a wide range. However, those bluffs are -EV in a vacuum, and work only to protect us from Villain adjusting by calling less (and folding more).

    This means that against people who call too much and will not adjust to you never bluffing, you should probably never bluff since it costs you money without earning you anything.

    Showdown Value and the Size of the Pot

    Conventional wisdom says that the larger the pot, the more valuable your showdown value is, so the wider the range you should check behind on the river.

    To test this, let's use the same scenario with P = 1.5 and x_1 = 0.7 to look at what happens to our checking range as the pot gets bigger. Note that the range from y_1 to y_0 is our checking range, so we can quantify its size as (y_0 - y_1). Here we're going to let y_0 be the balanced version, with y_1 = 0.35. The following are some values of P and (y_0 - y_1) when x_1 = 0.7 and Hero plays a balanced strategy with y_1 = x_1 / 2:

    P     (y_0 - y_1)
    1.5   0.5100
    1.6   0.5154
    1.7   0.5204
    1.8   0.5250
    1.9   0.5293
    2.0   0.5333
    2.1   0.5371
    2.2   0.5406
    2.3   0.5439
    2.4   0.5471
    2.5   0.5500

    And as expected, when the pot size increases, the size of our checking range increases, gradually approaching 0.65. Note we can only check 65% of hands because we're value betting the other 35%.
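
    For anyone who wants to generate the table themselves, here's a sketch (again assuming a = 1/(P + 1) for a 1-unit bet, which matches the values above):

```python
# Width of the balanced checking range (y_0 - y_1) as the pot grows,
# with x_1 = 0.7 and y_1 = x_1/2 = 0.35.
y1 = 0.35
for i in range(11):
    P = 1.5 + 0.1 * i
    a = 1 / (P + 1)
    y0 = 1 - a * y1            # balanced bluffing cutoff
    print(f"{P:.1f}  {y0 - y1:.4f}")
# As P grows, a -> 0 and the checking range approaches 1 - y1 = 0.65.
```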
    Last edited by spoonitnow; 12-12-2010 at 12:01 AM.
