Let's say I have a 6-sided dice and I want to accurately estimate the probability of throwing a six by using a model.
How do I determine how many runs I should do on my model to reach a 1% margin of error?
Printable View
Let's say I have a 6-sided dice and I want to accurately estimate the probability of throwing a six by using a model.
How do I determine how many runs I should do on my model to reach a 1% margin of error?
It's a bit hard to explain on here since it involves a bit of math, but here is my best attempt:
First of all, you can never guarantee that you will be within 1%. The best you can do is say that it will be within 1% 50 or 95 percent of the time.
I am assuming that you are running a program that chooses a random number 1-6 and counts the number of times 6 is chosen. So I am guessing your code looks something like this:
Now we don't know N yet, but let's assume that it is large (at least a couple hundred). Then we can say "probability" is within 1% if n is between 0.99*N/6 and 1.01*N/6 (since N/6 is the "right" answer).Code:for (int i=0; i<N; ++i)
{
//choose a number 1-6
if (number==6) ++n;
}
double probability=(double)n/N;
Calculate the probability that it is within this range using the Gaussian distribution and the central limit theorem where mu=Np and sigma^2=Np(1-p) (p here is 1/6).
If you looked up the integral on wiki and did some algebra, you should have an expression involving the error function of N and a bunch of numbers. Set this equal to 0.5 or .95 or whatever number you need and solve for N. (You'll need a calculator that can do inverse error function. Wolfram is good for that.)
Hopefully, this is enough to give you a push in the right direction. If you still need help, let me know and I'll explain it in more detail. It's just that on this forum, it's such a pain to try to describe complex equations.
s / sqrt(n) where s is the sample error (5/6) and n is the number of samples
soThat is just the probability that a given sample does not represent the population as a whole. Any given sample of that many may in fact be wildly inaccurate. This technique is often used by lobbyists to push their agenda using false surveys. Take a thousand surveys of 1000 people, and pick the one that represents your viewpoint best and claim it gives a standard error of 0.5 / sqrt(1000) or ~1.5%Code:#include <stdio.h>
int main(){
double s = 5.0/6.0;
unsigned long long n = 1;
while(s / sqrt(n) > 0.01) n++;
printf("%d samples needed.\n" , n);
return 0;
}
Awesome material, you two!
Thanks a bunch. I'll try and take it from here.