Originally Posted by
Mario F.
Instead where I am stuck is at algorithm analysis. I need a more concrete analysis of the OP pattern. As I said on post #7 I am fully aware at the loss of uniform distribution quality. The question however remains, how significant is it in the current space.
I hope you realize I didn't intend to lecture you; I just switched to lecture mode.
Yes, that is quite a complicated issue mathematically. The theoretical handling of it is stinky, so let's go the practical, numeric route, shall we?
The following program takes two parameters: the factor (the number of uniformly distributed integers generated) and the interval (the number of unique result values).
Code:
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    unsigned long values, modulus, i, *count;
    char dummy;

    if (argc != 3) {
        fprintf(stderr, "\nUsage: %s FACTOR INTERVAL\n\n", argv[0]);
        return EXIT_FAILURE;
    }

    if (sscanf(argv[1], " %lu %c", &values, &dummy) != 1 || values < 2UL) {
        fprintf(stderr, "%s: Invalid factor.\n", argv[1]);
        return EXIT_FAILURE;
    }

    if (sscanf(argv[2], " %lu %c", &modulus, &dummy) != 1 || modulus < 2UL || modulus > values) {
        fprintf(stderr, "%s: Invalid interval.\n", argv[2]);
        return EXIT_FAILURE;
    }

    count = malloc((size_t)modulus * sizeof *count);
    if (!count) {
        fprintf(stderr, "Not enough memory.\n");
        return EXIT_FAILURE;
    }

    /* Count how many of the 'values' uniformly distributed integers
       0 .. values-1 fold onto each residue modulo 'modulus'. */
    for (i = 0UL; i < modulus; i++)
        count[i] = 0UL;
    for (i = 0UL; i < values; i++)
        count[i % modulus]++;

    /* Report runs of consecutive results that occur equally often.
       Each probability is a count out of the 'values' total draws. */
    i = 0UL;
    while (i < modulus) {
        unsigned long o = i;
        while (o + 1UL < modulus && count[o] == count[o+1UL])
            o++;
        if (o > i)
            printf("%lu .. %lu occur %lu/%lu of the time each\n", i, o, count[i], values);
        else
            printf("%lu occurs %lu/%lu of the time\n", i, count[i], values);
        i = o + 1UL;
    }

    free(count);
    return EXIT_SUCCESS;
}
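For example, compiled as, say, aliasing (the file and binary names are just my choice), it directly reports the probabilities for the two cases discussed below:
Code:
$ gcc -O2 -o aliasing aliasing.c
$ ./aliasing 10 9
0 occurs 2/10 of the time
1 .. 8 occur 1/10 of the time each
$ ./aliasing 100 9
0 occurs 12/100 of the time
1 .. 8 occur 11/100 of the time each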
First, let's look at the smallest factor, 10, and the probabilities of each result for different intervals (assuming prng() produces uniformly distributed nonnegative values less than one):
Code:
Interval      2     3     4     5     6     7     8     9
Result
   0        5/10  4/10  3/10  2/10  2/10  2/10  2/10  2/10
   1        5/10  3/10  3/10  2/10  2/10  2/10  2/10  1/10
   2              3/10  2/10  2/10  2/10  2/10  1/10  1/10
   3                    2/10  2/10  2/10  1/10  1/10  1/10
   4                          2/10  1/10  1/10  1/10  1/10
   5                                1/10  1/10  1/10  1/10
   6                                      1/10  1/10  1/10
   7                                            1/10  1/10
   8                                                  1/10
This means that with an interval = 9, your solution provides 0, the lower bound, 20% of the time, whereas every other value occurs only 10% of the time each. That's a pretty bad bias.
For larger factors, the aliasing when interval == factor-1 stays just as bad. For example, if interval=9999 and factor=10000, the lower bound occurs 2/10000 of the time, versus every other value occurring only 1/10000 of the time each; the lower bound is still twice as likely as any other result.
Again, none of this aliasing occurs if you just use min + (int)((double)interval * prng()).
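To spell the two patterns out side by side (this is just my reading of the pattern under discussion; prng() is assumed to be declared elsewhere and to return uniformly distributed doubles in [0, 1)):
Code:
/* Modulo-based pattern analyzed above: prng() is first scaled to 'factor'
   distinct integers, which are then folded onto 'interval' results.
   The folding is what produces the aliasing shown in the tables. */
int random_modulo(int min, int interval, int factor)
{
    return min + (int)((double)factor * prng()) % interval;
}

/* Direct scaling: prng() is scaled straight to 'interval' results, so every
   result corresponds to an equally wide slice of [0, 1). No aliasing. */
int random_scaled(int min, int interval)
{
    return min + (int)((double)interval * prng());
}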
What if we scale factor by ten, so that it is always at least ten times the interval?
Code:
Interval       2       3       4       5       6       7       8       9
Result
   0        50/100  34/100  25/100  20/100  17/100  15/100  13/100  12/100
   1        50/100  33/100  25/100  20/100  17/100  15/100  13/100  11/100
   2                33/100  25/100  20/100  17/100  14/100  13/100  11/100
   3                        25/100  20/100  17/100  14/100  13/100  11/100
   4                                20/100  16/100  14/100  12/100  11/100
   5                                        16/100  14/100  12/100  11/100
   6                                                14/100  12/100  11/100
   7                                                        12/100  11/100
   8                                                                11/100
The aliasing drops, so that the absolute difference in probabilities is at most one percentage point. (Relatively speaking, if the interval is 9 and the factor is 100, then 0 occurs about 9% more often than 1, or indeed any of the other values. In absolute terms, 0 occurs 12% of the time, whereas each of the other values occurs only 11% of the time.)
Similarly, if interval=9999 and factor=100000, the ten results closest to the lower bound occur 11/100000 (0.011%) of the time each, whereas the others occur 10/100000 (0.010%) of the time each, so the overrepresented results are still 10% more likely than the rest.
So, as you can see, the aliasing problem tends to be worst when the interval (used as the divisor) is close to the factor. You can mitigate it by scaling the factor larger, but I just cannot understand why you'd use the modulo operation in the first place.
Please do experiment with the above program on different factors and intervals; it definitely helps me grasp this aliasing/folding issue at an intuitive level.
Originally Posted by
Mario F.
Please understand. I'm not very keen on written rules. I always cringe when I'm told I shouldn't do floating point operations, for instance.
Oh, absolutely! That's why I prefer longer posts, giving the readers the tools to examine the issue more closely, rather than giving "rules of thumb" that eventually turn out to be quite misleading.
My point was that if you start with a prng() that produces nonnegative integers up to some 2^N - 1 inclusive, with all binary digits also having a uniform distribution (as is automatically the case if the generator is good, i.e. random enough), you will have much better tools for providing the higher-level abstractions, like random vectors or nonuniform distributions.
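For example, turning such an integer generator into the prng() used elsewhere in this post (uniform doubles in [0, 1)) is a one-liner. A minimal sketch, assuming a hypothetical prng_u64() that returns uniform 64-bit unsigned integers:
Code:
#include <stdint.h>

/* Uniform double in [0, 1): a double has a 53-bit mantissa, so keep the
   top 53 bits of the 64-bit value and divide by 2^53. */
double prng_double(void)
{
    return (double)(prng_u64() >> 11) * (1.0 / 9007199254740992.0);
}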
Originally Posted by
Mario F.
So, instead of just accepting the well known patterns (rng() * interval + min) I'm more interested in exploring new or little utilized patterns and understand their implications. It's not that rng() * interval + min hurts my sensitivities in any way. I'm just exploring other venues.
That is an excellent thing, and I hope I'm not curbing your enthusiasm. It just happens that the modulo operation causes this aliasing issue. I've been kicked in the teeth by it, so I may be a bit too keen on helping others avoid it, too.
The exclusion method I mentioned is way underutilized. I've basically only seen it in physics simulations, where the aliasing issues tend to be annoyingly visible in the results. There are some good articles on it (in numerical computing), but I couldn't find any good references using a quick Ducking/Googling. (Funnily enough, the fourth match on duckduckgo.com was my own post at LinuxQuestions.org.)
It should be used more often, especially when your lower-level generator is fast, say Xorshift.
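To show what I mean in code, here is a minimal sketch of the exclusion method for integers, using Marsaglia's xorshift32 as the fast low-level generator; the names prng_u32() and prng_below() are just placeholders of my own:
Code:
#include <stdint.h>

/* Marsaglia's xorshift32; the state must be seeded with a nonzero value. */
static uint32_t prng_state = 2463534242u;

static uint32_t prng_u32(void)
{
    uint32_t x = prng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return prng_state = x;
}

/* Exclusion method: uniform integer in 0 .. interval-1, without modulo bias.
   Values at or above the largest multiple of 'interval' representable in
   32 bits are rejected and regenerated. */
static uint32_t prng_below(const uint32_t interval)
{
    const uint32_t limit = (0xFFFFFFFFu / interval) * interval;
    uint32_t value;
    do {
        value = prng_u32();
    } while (value >= limit);
    return value % interval;
}
Because only a whole number of copies of the interval is accepted, every result is exactly equally likely. When the interval is at most 2^31, at most two generator calls are needed on average, and for small intervals essentially just one.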
The exclusion method is quite useful for floating-point prng(), too.
For example, consider the case where you want to generate a random unit vector in 3D. You have two basic approaches: either you use trigonometric functions (basically a polar coordinate system and adjustments to make each direction equally probable), or the exclusion method. The exclusion method ends up being MUCH faster, and is easy to prove to yield uniform vectors:
Code:
#include <math.h>

typedef struct {
    double x;
    double y;
    double z;
} vec3d;

/* Random vector of the given length, pointing in a uniformly distributed
   direction. Assumes prng() returns uniformly distributed doubles in [0, 1). */
vec3d prng_vector(const double length)
{
    vec3d v;
    double vv;

    do {
        /* Candidate vector with each component uniform in [-1, +1). */
        v.x = 2.0 * prng() - 1.0;
        v.y = 2.0 * prng() - 1.0;
        v.z = 2.0 * prng() - 1.0;
        vv = v.x*v.x + v.y*v.y + v.z*v.z;
        /* Exclusion: accept only candidates in the spherical shell
           0.5 < |v| < 1.0 (in squared terms, 0.25 < vv < 1.0). */
    } while (vv <= 0.25 || vv >= 1.0);

    /* Scale the accepted vector to the requested length. */
    vv = length / sqrt(vv);
    v.x *= vv;
    v.y *= vv;
    v.z *= vv;
    return v;
}
The above function generates uniform random 3D vectors in the cube (-1,-1,-1)-(+1,+1,+1), but rejects all outside the unit sphere, and all inside the half-radius sphere (to avoid aliasing/quantization for short vectors). The volume of the cube is 8, and the volume of the accepted shell is 4π/3-π/6 ≃ 3.665, so on average, we need to generate 8/3.665 = 2.183 random vectors (6.548 random numbers) for each result vector.
Even with that overhead, this ends up being much faster than the trigonometric approach.
For 2D, the unit vector generation is simpler, as you can just generate one value between 0 and 2π and use that as the direction; i.e. a = 2.0*3.14159265358979323846*prng(); x = cos(a); y = sin(a);
However, on architectures without hardware trigonometric functions, or slow sin()/cos() implementation, the exclusion method may turn out to be faster.
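In that case, a minimal sketch of the 2D version, again assuming prng() returns uniform doubles in [0, 1), might look like this:
Code:
#include <math.h>

typedef struct {
    double x;
    double y;
} vec2d;

/* 2D analogue of prng_vector(): rejects candidates outside the unit circle
   and inside the half-radius circle, then scales to the requested length. */
vec2d prng_vector2d(const double length)
{
    vec2d v;
    double vv;

    do {
        v.x = 2.0 * prng() - 1.0;
        v.y = 2.0 * prng() - 1.0;
        vv = v.x*v.x + v.y*v.y;
    } while (vv <= 0.25 || vv >= 1.0);

    vv = length / sqrt(vv);
    v.x *= vv;
    v.y *= vv;
    return v;
}
Here the square has area 4 and the accepted annulus has area pi - pi/4 = 2.356, so on average about 1.7 candidate vectors (3.4 random numbers) are needed per result.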
For my own needs, I tend to write prng_vector() using integer types, only converting to floating-point formats at the very end. On most architectures, it ends up slightly faster (being able to use more than one ALU in superscalar processors like 64-bit AMD and newer Intel processors) than floating-point arithmetic. Plus it's obvious then that the distribution is uniform (given uniform integers).
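For illustration, here is roughly what I mean; this is only a sketch of one possible integer-domain variant, using the hypothetical prng_u32() from the xorshift32 sketch above, with 2^31 playing the role of the unit radius and 2^30 the half radius:
Code:
#include <stdint.h>
#include <math.h>

/* Integer-domain exclusion, converting to double only at the very end. */
vec3d prng_vector_int(const double length)
{
    int64_t x, y, z;
    uint64_t vv;
    double scale;
    vec3d v;

    do {
        /* Each component is uniform in -2^31 .. 2^31-1. */
        x = (int32_t)prng_u32();
        y = (int32_t)prng_u32();
        z = (int32_t)prng_u32();
        /* Squared magnitude in 64-bit integer arithmetic; the maximum
           possible sum, 3 * 2^62, still fits in an unsigned 64-bit value. */
        vv = (uint64_t)(x*x) + (uint64_t)(y*y) + (uint64_t)(z*z);
        /* Accept only the shell between the half radius 2^30 and the
           unit radius 2^31, i.e. 2^60 < vv < 2^62. */
    } while (vv <= ((uint64_t)1 << 60) || vv >= ((uint64_t)1 << 62));

    scale = length / sqrt((double)vv);
    v.x = (double)x * scale;
    v.y = (double)y * scale;
    v.z = (double)z * scale;
    return v;
}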