Thread: Probability estimation

  1. #1
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446

    Probability estimation

    Let's say I have a 6-sided dice and I want to accurately estimate the probability of throwing a six by using a model.

    How do I determine how many runs I should do on my model to reach a 1% margin of error?
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  2. #2
    Registered User NeonBlack's Avatar
    Join Date
    Nov 2007
    Posts
    431
    It's a bit hard to explain on here since it involves a bit of math, but here is my best attempt:
    First of all, you can never guarantee that you will be within 1%. The best you can do is say that it will be within 1% 50 or 95 percent of the time.
    I am assuming that you are running a program that chooses a random number 1-6 and counts the number of times 6 is chosen. So I am guessing your code looks something like this:
    Code:
    for (int i=0; i<N; ++i)
    {
        //choose a number 1-6
        if (number==6) ++n;
    }
    double probability=(double)n/N;
    Now we don't know N yet, but let's assume that it is large (at least a couple hundred). Then we can say "probability" is within 1% if n is between 0.99*N/6 and 1.01*N/6 (since N/6 is the "right" answer).

    Calculate the probability that it is within this range using the Gaussian distribution and the central limit theorem where mu=Np and sigma^2=Np(1-p) (p here is 1/6).

    If you looked up the integral on wiki and did some algebra, you should have an expression involving the error function of N and a bunch of numbers. Set this equal to 0.5 or .95 or whatever number you need and solve for N. (You'll need a calculator that can do inverse error function. Wolfram is good for that.)

    Hopefully, this is enough to give you a push in the right direction. If you still need help, let me know and I'll explain it in more detail. It's just that on this forum, it's such a pain to try to describe complex equations.
    I copied it from the last program in which I passed a parameter, which would have been pre-1989 I guess. - esbo

  3. #3
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    s / sqrt(n) where s is the sample error (5/6) and n is the number of samples

    so
    Code:
    #include <stdio.h>
    
    int main(){
       double s = 5.0/6.0;
       unsigned long long n = 1;
    
       while(s / sqrt(n) > 0.01) n++;
    
       printf("%d samples needed.\n" , n);
    
       return 0;
       }
    That is just the probability that a given sample does not represent the population as a whole. Any given sample of that many may in fact be wildly inaccurate. This technique is often used by lobbyists to push their agenda using false surveys. Take a thousand surveys of 1000 people, and pick the one that represents your viewpoint best and claim it gives a standard error of 0.5 / sqrt(1000) or ~1.5%
    Last edited by abachler; 09-17-2009 at 06:16 PM.

  4. #4
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Awesome material, you two!
    Thanks a bunch. I'll try and take it from here.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. drand48() Probability
    By BENCHMARKMAN in forum C Programming
    Replies: 11
    Last Post: 02-18-2008, 08:57 PM
  2. keeping stats - probability
    By -JM in forum C++ Programming
    Replies: 6
    Last Post: 08-20-2005, 09:52 AM
  3. 0% probability == impossible?
    By Silvercord in forum A Brief History of Cprogramming.com
    Replies: 49
    Last Post: 09-01-2003, 03:48 PM
  4. Assigning probability
    By dalek in forum C++ Programming
    Replies: 3
    Last Post: 08-17-2003, 08:26 PM
  5. area estimation of graph
    By hei in forum C Programming
    Replies: 3
    Last Post: 10-16-2001, 12:23 AM