Why 10k hands before stats mean anything? (long and technical)

Results 1 to 20 of 20
  1. #1
    MadMojoMonkey's Avatar
    Join Date
    Apr 2012
    Posts
    10,322
    Location
    St Louis, MO

    There is no firm number, but here's a thought on statistics: (Yeah I know. Well, I'm warning you now that what I'm about to say is mostly useless and will not make you a better poker player, so feel free to leave now if you don't want to hear my rationalization.)

    When we talk about a VPIP/PFR (or any stat), we are making an estimate of that frequency. We don't know the EXACT value of what that stat is, since it's constantly being updated by new information. Since we aren't 100% sure, we have some uncertainty. Like the VPIP is 20% +/- 2%. Easy enough.

    Mathematicians have provided us with a whole field of study devoted to statistics and estimating them and figuring out how much confidence we can have in them.

    Confidence Interval (CI): The range of values around the average which we are some % certain that the "actual" value will be after much more data is collected. It is tough, but the language goes, "We are x% sure that the answer is y% +/- z%."

    I will choose x% to be 19 out of 20, or 95%, which means, "If I do what I just did 20 more times, I expect that only ONCE will a result fall outside my uncertainty."

    Why not choose 100%, and have certainty? In order to do that, our error range is infinite... The only thing we know for certain is that it IS a number. Why not narrow our error range down to +/-0%? Then we have 0% confidence in our estimate; we have no measure of how right or wrong it might be. So we HAVE to make a trade off to get both a non-zero CI and a non-infinite error range. (Try not to be distracted by the Heisenberg Principle here. We're talking about statistics in general, not the position and velocity of an electron.)

    Now, a 95% CI requires about 5 successes to yield a percent that is equal in uncertainty to its value (more for very small percentages). *note: This is true for frequencies less than 50%. For frequencies greater than 50%, the uncertainty after 5 successes is 1-average% (I hate to say trust me, but trust me on this one.)

    That is: if you do something 100 times and get a "successful result" 5 of those times, then your estimate of the frequency of a success is 5% +/- 5%. If you do something 20 times and are successful 5 times, then your frequency of success is 25% +/- 25%. If you flip a coin 10 times and get heads 5 times, then you know ABSOLUTELY NOTHING about whether that is a fair coin (50% +/- 50% = 0% - 100%). OK, so we understand that there is some minimal sample size that we need to have meaningful results.
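
For anyone who wants to check ranges like these, here is a minimal sketch using the plain normal-approximation (Wald) interval, z = 1.96 for 95%. That formula is an assumption, not necessarily the exact method behind the rule of thumb above, and the function name `wald_half_width` is made up for illustration. The z-based range comes out somewhat narrower than "+/- the value itself", especially at higher frequencies, and a small-sample correction would widen it again.

```python
import math

def wald_half_width(successes, trials, z=1.96):
    """Half-width of the normal-approximation (Wald) interval for a proportion."""
    p = successes / trials
    return z * math.sqrt(p * (1 - p) / trials)

# Same number of successes (5) at different sample sizes, plus a larger sample.
for successes, trials in [(5, 100), (5, 20), (5, 10), (50, 1000)]:
    p = successes / trials
    half = wald_half_width(successes, trials)
    print(f"{successes}/{trials}: {p:.1%} +/- {half:.1%}")
```

The last row has the same 5% frequency as the first but ten times the data, so its range is much tighter.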

    What happens to our error range when we get more than 5 results? I'm not getting into the specifics of that here. I'll just say that the more data we have, the smaller our range of error becomes; the more robust our results.

    1st point) If you are tracking a stat like CR (check-raise), and your villain has only had 6 times where they checked and it wasn't checked around, the stat of 17% is TOTALLY MEANINGLESS. It means that one time in 6 the villain has check-raised. One time is NOT enough to build a frequency!! The error range to within 95% CI includes negative numbers!! That's clearly not acceptable, as you can not have a negative chance of success.
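
A quick sketch of the 1-in-6 check-raise case, again assuming the simple normal-approximation (Wald) interval with z = 1.96 (the helper name `wald_interval` is hypothetical). It shows where the negative lower bound comes from: the formula does not know that a frequency cannot go below 0%.

```python
import math

def wald_interval(successes, trials, z=1.96):
    """Normal-approximation (Wald) interval for a proportion; not clipped to [0, 1]."""
    p = successes / trials
    half = z * math.sqrt(p * (1 - p) / trials)
    return p - half, p + half

# One check-raise seen in six opportunities.
low, high = wald_interval(1, 6)
print(f"check-raise: {1/6:.1%}, 95% range {low:.1%} to {high:.1%}")
```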

    OK, so why 10k hands? Here goes:
    2nd point)
    There are 169 significantly different starting hands. @ FR, there are 9 positions. It is not unreasonable to think of Holdem as an opportunity to play 169*9 = 1,521 different "games", each one has a specific pocket (with unspecific suits) and a specific position. In order to evaluate our success, we need to have played each game at least 5 times. 1,521*5 = 7,605 total hands before we can START to build a picture of our overall profitability. Remember this is the minimum possible sample that doesn't have negative frequencies in our error ranges.

    Now, it's reasonable to think that not every pocket was played from every position exactly 5 times in our sample, huh? So even this "minimum" threshold of data is far too limited.
    So let's double the number of hands we play from each position, to give ourselves a more stable data set. 169*9*10 = 15,210
    What about the 6-max tables? 169*6*10 = 10,140

    Tada!! There's the ~10k hands figure!!!
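
One reason even the doubled figure is optimistic: the 169 starting hands are not dealt equally often. Of the 1,326 two-card combos, a specific pair has 6, a specific suited hand 4, and a specific offsuit hand 12. The sketch below estimates how many total hands it takes before one specific pocket is EXPECTED to be dealt 5 times in one specific seat. It assumes you sit in every seat equally often and counts hands dealt, not hands played; `hands_needed` is an illustrative name.

```python
import math
from fractions import Fraction

TOTAL_COMBOS = 1326  # C(52, 2) two-card combinations
COMBOS = {"pair": 6, "suited": 4, "offsuit": 12}

def hands_needed(kind, positions, target=5):
    """Total hands so one specific pocket is expected `target` times in one seat."""
    # Chance per hand of being dealt this pocket in this particular seat.
    per_hand = Fraction(COMBOS[kind], TOTAL_COMBOS) / positions
    return math.ceil(target / per_hand)

for positions in (9, 6):
    for kind in COMBOS:
        print(f"{positions} seats, {kind}: {hands_needed(kind, positions):,} hands")
```

Suited hands are the bottleneck, since they show up least often.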

    3rd point) This is still a small sample set, as we'd really prefer to have 60+ "successes" for whatever frequency we're estimating. It is important to note that up until now we're assuming that our ability to estimate stats is based solely on our pre-flop holdings and position. A sample of 60+ per pocket per position starts to account for all the different flops.

    FR: 169*9*60 = 91,260
    6-max: 169*6*60 = 60,840

    So after this many hands, you can see a reasonable picture of how you play each hand from each position on a variety of flops. *note: we still haven't taken # of villains in the hand into account.
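
All of the thresholds in this post are the same product, 169 pockets x seats x samples per pocket-and-seat. A small sketch that prints the whole grid (the function name is just for illustration):

```python
STARTING_HANDS = 169  # distinct pockets when specific suits are ignored

def hands_required(positions, samples_per_cell):
    """Hands needed if every pocket/position pair is seen `samples_per_cell` times."""
    return STARTING_HANDS * positions * samples_per_cell

for label, positions in [("FR", 9), ("6-max", 6)]:
    for samples in (5, 10, 60):
        total = hands_required(positions, samples)
        print(f"{label}, {samples} per pocket per position: {total:,}")
```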
  2. #2
    !Luck's Avatar
    Join Date
    Feb 2004
    Posts
    1,876
    Location
    Under a bridge
    KISS
  3. #3
    sorry that i just skimmed through it, but you're talking about finding n for a confidence interval for p for your own stats to analyze how you play?
    Last edited by Imthenewfish; 04-15-2012 at 01:36 AM.
  4. #4
    Join Date
    Dec 2009
    Posts
    1,441
    Location
    IRC, Come join me!
    @ point 1, if villain c/r'd me 1/6 times I am 100% confident his c/r % is going to be non zero regardless of what statistics tells me.
  5. #5
    Join Date
    Aug 2007
    Posts
    8,697
    Location
    soaking up ethanol, moving on up
    Quote Originally Posted by MadMojoMonkey View Post
    Yeah I know. Well I'm warning you now that what I'm about to say is mostly useless and will not make you a better poker player, so feel free leave now if you don't want to hear my rationalization.
    I frequently put all of my chips in the middle based on stats gained from 30 hands or fewer. I don't really care that my estimates of true x are not as accurate as they could be.

    i almost left after this ^, but i couldn't help but skim your post after i typed my initial response.

    re your claim that someone's c-raise frequency could be negative after you have seen it happen. This is obviously ridiculous, and you don't understand that there are a number of types of statistical distributions. This reads like someone who encountered omg STANDARD DEVIATION WOW in high school and hasn't really progressed much from there

    cool that you're posting though, i guess.

    p.s. yep, it's been a while since i did graduate maths or statistics, so maybe my understanding is rusty and dated.
  6. #6
    MadMojoMonkey's Avatar
    Join Date
    Apr 2012
    Posts
    10,322
    Location
    St Louis, MO
    @ !luck: I warned you at the top that it was long and technical and not directly about poker. I didn't know how much to illustrate the terms and I thought at least a couple of examples would be helpful.

    @ Imthenewfish: I was trying to find a rational way to get that 10k number to show up. Many new players don't understand why the long term is so long.

    @ Icanhastreebet: Yes, absolutely! BUT that's not the question we're asking.
    That question is: Can this stat be 0%?
    Our question is: What range of values for this stat would I expect to get if I collected this same sample size again?

    @ daven: There is NEVER the chance that a frequency can be less than 0% or greater than 100%. If the error range of the CI for a stat includes numbers which are "not allowed", that indicates that not enough data has been collected, and our estimate of the "actual" stat is not reliable.

    My main point here is to illustrate that a small sample size gives unreliable data and there IS a way to determine how big a sample size needs to be before the data becomes reliable. The number 10k hands gets thrown around a lot and I was trying to show a mathematical justification for why.
  7. #7
    MadMojoMonkey's Avatar
    Join Date
    Apr 2012
    Posts
    10,322
    Location
    St Louis, MO
    Quote Originally Posted by daven View Post
    This is obviously ridiculous, and you don't understand that there are a number of types of statistical distributions. This reads like someone who encountered omg STANDARD DEVIATION WOW in high school and hasn't really progressed much from there.

    I'm using a standard Geometric distribution based on Bernoulli trials (i.e. true/false scenarios where each one is unrelated to past or future events) to estimate any given stat. This means that all I know about a stat is how many times it could have happened and whether it did happen when it could have.

    I'm using a student's T-dist for the CI's on stats with less than 20 trials and a normal distribution for CI's on stats with more than 20 trials (although that's not apparent in my OP).

    I have a degree in physics and a minor in mathematics, so if I'm not satisfying your sense of fullness, please just ask a question. I may not understand the stats as well as my professors, but I've certainly used them in lab reports for the last 4 years.

    I'm not trying to "sound smart" or anything. I'm just thinking about something that seemed arbitrary (10k hands to build a stat profile) and trying to find if there is logic behind it.
  8. #8
    supa's Avatar
    Join Date
    Feb 2010
    Posts
    3,529
    Location
    At the bar drinking whisky with an "e"
    Sorry but I've read almost none of this thread because maths and hangovers don't mix well with me. I think this will help with your thought process tho.

    http://www.flopturnriver.com/pokerfo...h-175213.html?
    “Right thoughts produce right actions and right actions produce work which will be a material reflection for others to see of the serenity at the center of it all”

    Put hero on a goddamn range part II- The 6max years

    Quote Originally Posted by d0zer View Post
    start using your brain more and vagina less

    Quote Originally Posted by kingnat View Post
    Members who's signature is a humorous quote about his/herself made by someone who is considered a notable member of the FTR community to give themselves a sense of belonging.
  9. #9
    MadMojoMonkey's Avatar
    Join Date
    Apr 2012
    Posts
    10,322
    Location
    St Louis, MO
    Yes! Thank you for that link. This is definitely along the lines I was thinking.

    Stats like VPIP and PFR converge quickly because the opportunity to do both happens every hand. However, 3-bet% takes longer to be meaningful, since it can't necessarily happen on every hand. Trying to pin down 4-bet% on the flop can take an immense number of hands. However, all this pertains to reading a villain.

    The 10k hands is supposed to be the number that represents our overall profitability. At least it's meant to give us a somewhat variance-resistant sample size, which we need to justify a deep examination of our entire strategy. Otherwise we end up over-correcting things that may not need correcting at all.

    P.S.: My citation of the Geometric distribution above was completely wrong. The Bernoulli trials stuff is all I'm using. I've linked the 2 in my head due to my own prior uses.
  10. #10
    Join Date
    Aug 2007
    Posts
    8,697
    Location
    soaking up ethanol, moving on up
    Quote Originally Posted by MadMojoMonkey View Post

    I'm using a standard Geometric distribution
    ? a little knowledge is a dangerous thing
  11. #11
    Quote Originally Posted by MadMojoMonkey View Post

    (10k hands to build a stat profile) and trying to find if there is logic behind it.
    Will knowing this help us play poker better? Srs question
    [20:19] <Zill4> god
    [20:19] <Zill4> u guys
    [20:19] <Zill4> so fking hopeless
    [20:19] <Zill4> and dumb
  12. #12
    MadMojoMonkey's Avatar
    Join Date
    Apr 2012
    Posts
    10,322
    Location
    St Louis, MO
    This 10k number gets tossed about a bit. I was just thinking of what number I might use myself. The more I think about poker and don't (solely) take people's word for it, the better I become. I assume this is how we all become great. The little picture session analyses are important. I'm sharing some of my thought process on when I should take a big picture, away-from-the-table look at my play.

    Knowing this will help me understand why it's important to look at a sample of ~10k hands or more before I draw major conclusions about my overall strengths and leaks. I hope that knowing this will keep me from drawing conclusions prematurely.

    I'm sharing because I hope it helps someone else, too. If it doesn't, well. I'm sorry I wasted your time, but I did warn you at the top.
  13. #13
    Yeah I agree with not taking people's word for it part...I just think that the 10k thing is a bit like the old poker adage of position in poker being like water: you don't need to know why it's good for you, it just is. There are other things you should probably spend time thinking about.

    Enough mathematically savvy players will have bandied that number around for us to trust that it's somewhat reliable, or certainly enough for bustostakes.
  14. #14
    supa's Avatar
    Join Date
    Feb 2010
    Posts
    3,529
    Location
    At the bar drinking whisky with an "e"
    Read the OP (kinda, still mad hungover) and I don't think this is a waste of time and I do think it's worth understanding why old adages are what they are. Knowing why position is good for you is a billion times better than just knowing that it is.

    Things that aren't being considered here are relevant reads outside of stats. And sometimes we don't even need reads other than stats to make solid decisions with a low # of hands on villain.

    If we have say 200 or so hands on villain who has an 85% fold to 3bet, I'm 3betting the crap outta him until he adjusts or otherwise proves that # to be incorrect.

    If he has a low 3bet but we've seen him flat JJ,QQ we can assume he's pretty fucking strong when he 3bets us, at least until he proves otherwise. (it's just an example so don't everybody crawl up my ass about the details)

    I'm not saying you're wrong because there is a lot of merit to what you're saying and as it's been pointed out already it's been said before by greater players than us, but I don't think we really need a huge sample to make solid decisions about most things. We just need to be smart.

    How often do we get 10k hands on villains at the micros anyway?
  15. #15
    rpm's Avatar
    Join Date
    Jul 2009
    Posts
    3,084
    Location
    maaaaaaaaaaate
    i've never had >3k hands tracked vs any opponent i've ever played against. but then i've never been a particularly high volume player (~130k hands played last year, for a sense of perspective)
  16. #16
    !Luck's Avatar
    Join Date
    Feb 2004
    Posts
    1,876
    Location
    Under a bridge
    Arg man. I know what you're trying to do, but to help a true beginner you don't even need 10k hands. A sample of 100 is prob fine, since the leaks are huge. The chance that a 40/10 player is actually a 13/11 player is fairly small; some math whiz can calculate this.

    Not to mention all statistics on beginners are a bit moot, since beginners tend to improve dramatically in their first 10k hands, so the first 2k hands have almost no relevance to the 2k hands at the end.
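
On the 40/10 versus 13/11 question, here is a rough sketch of the calculation, looking only at VPIP and treating every hand as an independent yes/no trial (both are simplifying assumptions). It sums the exact binomial tail: the chance that a player whose true VPIP is 13% voluntarily puts money in on 40 or more of 100 hands.

```python
import math

def binom_tail(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p), summed exactly."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )

prob = binom_tail(40, 100, 0.13)
print(f"P(40+ VPIP hands in 100 | true VPIP 13%) = {prob:.2e}")
```

Under those assumptions the probability is far below one in a million, so a 100-hand sample really is enough to tell these two player types apart.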
  17. #17
    Hi

    regarding what you said about confidence intervals, how after only a small number of actions are observed that these can extend to the negative side of the number spectrum etc etc

    I think you are confused. "Check-raise" is an action, one of several potential actions a player may take, and of course also relies on conditional probabilities (i.e., one may be more likely to check raise in positions where they may have strong holdings, and they may have strong holdings more often when they raise preflop, for example).

    That aside, being that check-raising is one of a number of potential options and that we are concerned with the likelihood of it occurring, the statement that "the confidence interval extends into negative numbers" is absolutely ridiculous, and is indicative of nothing. Stop trying to interpret such things in a linear manner. We are dealing with probabilities; think of a normal or [ exp(.) ] / [ 1 + exp(.) ] i.e. logistic distribution instead. These will always fit probabilities between 0 and 1.
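
One standard interval that always stays between 0 and 1 is the Wilson score interval. To be clear, this is a swapped-in technique for illustration, not the logistic model mentioned above, and `wilson_interval` is a made-up helper name. A minimal sketch for 1 check-raise in 6 opportunities:

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a proportion; always lies within [0, 1]."""
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials**2)) / denom
    return center - half, center + half

low, high = wilson_interval(1, 6)
print(f"1 of 6: 95% range {low:.1%} to {high:.1%}")
```

The range is still very wide, which fits the broader point about small samples, but it never dips below zero.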

    In conclusion, you are misspecifying the problem at hand.

    Also I should mention I skimmed most of your post etc etc, no offense. and thanks for contributing anyway.

    EDIT: Oh yeah I do agree with the basic premise that larger sample sizes are needed etc. But I don't have much problem with people making inferences based off small-sample properties either, like when sitting with a group of opponents for less than an hour and having to decide whether a given player 3-bets too often or whatever.
    Last edited by Penneywize; 04-17-2012 at 09:25 PM.
  18. #18
    Since my Maniac Math post got linked above (thanks, btw), and since I'm back playin' pokerz, I'll say a few words.

    Estimation of proportions (like PFR and 3b) using statistics IS more difficult when the proportions are small (5% or less) or large (95% or more). The problem is that typical z-test methods that estimate the binomial distribution do work, but they need sample sizes of 200+ to become reliable. Estimating Villain's VPiP of, say, 20 - 25% requires sample sizes of 50 or so to be reliable.
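
A common textbook rule of thumb for when the normal approximation is usable is that both the expected successes and expected failures are at least 10. It is an assumption that this is the rule behind the sample sizes above, but it reproduces them. A minimal sketch, with the frequency given in whole percent (`min_sample_size` is an illustrative name):

```python
import math
from fractions import Fraction

def min_sample_size(percent, min_expected=10):
    """Smallest n with n*p >= min_expected and n*(1-p) >= min_expected."""
    rarer = min(percent, 100 - percent)  # the less likely outcome, in percent
    return math.ceil(Fraction(min_expected * 100, rarer))

for percent in (5, 20, 25, 95):
    print(f"{percent}% stat: at least {min_sample_size(percent)} opportunities")
```

Note that this counts opportunities for the stat, not hands dealt, so a 3-bet stat needs more hands than the number shown.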

    I also want to caution folks about this "you need 10K HH's" for evaluation of your leaks to be meaningful. There are a couple of problems with this.

    First, we have tilted sessions, great sessions and medium-good sessions. We don't always play JJ the same way in the same spots. A lot depends on opponents. There's enough variation in our game play that the "leaks" we find even after 10K hands may be relics of some hour of tilted-manic-donk-off poker.

    Take stats for what they're worth: general reads that, when combined with one another, can point to areas of our game where we can improve. How do we improve? We filter out some hands from similar spots and work through them by posting HH's, analyzing them ourselves, and PUTTING OUR OPPONENTS ON A GYAW-DANG RANGE!!

    Use the stats to look for POTENTIAL leaks (at any point that you have 3K or 4K in your sample). But don't marry the leaks. Keep looking at HH's and analyzing your specific decision-making until you find a trend of poor play at the table.
  19. #19
    supa's Avatar
    Join Date
    Feb 2010
    Posts
    3,529
    Location
    At the bar drinking whisky with an "e"
    Yay Robb!
  20. #20
    Even at 10k there's just so many factors skewing the distribution but it's as good a calling point as any

    As for villains, you have to start building a profile from hand #1 and a single check raise is never meaningless.
    Congratulations, you've won your dick's weight in sweets! Decode the message in the above post to find out how to claim your tic-tac
