Thread: Preserving RNG distribution on int cast

  1. #1
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446

    Preserving RNG distribution on int cast

    Assuming random() is an RNG with uniform distribution returning a float in the range [0.0, 1.0[, the following is my attempt to generalize a min/max range of integer output values:

    Code:
    interval = max - min + 1
    factor = 10 ^ floor(log10(interval) + 1)
    result = int(random() * factor % interval) + min
    The question I have is whether `factor` alters the uniformity of the outcome's distribution and, if so, whether multiplying it by a further factor of 10 will help.
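    For concreteness, here is the same pattern worked through in Python (a minimal sketch; random.random() stands in for random(), and the min/max values are arbitrary):
    Code:
    import math
    from random import random

    min_, max_ = 7, 23
    interval = max_ - min_ + 1                            # 17
    factor = 10 ** math.floor(math.log10(interval) + 1)   # 100
    r = random()
    result = int(r * factor % interval) + min_            # e.g. r = 0.42 gives int(42 % 17) + 7 = 15
    print(result)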
    Last edited by Mario F.; 04-16-2015 at 08:28 PM.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  2. #2
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    The question I have is whether `factor` alters the uniformity of the outcome's distribution and, if so, whether multiplying it by a further factor of 10 will help.
    O_o

    A change in uniformity is the least of your worries.

    Code:
    max = 100
    min = 0
    interval = max - min + 1 # 101
    factor = pow(10,floor(log10(interval) + 1)) # 1000
    result = int(0.0 * factor % interval) + min # 0
    result = int(1.0 * factor % interval) + min # 91
    You've been working with Python so I'm going to suggest the `random.randrange(min, max)' function for now.
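    For reference, randrange() excludes its stop value, so matching an inclusive [min, max] needs the stop bumped by one (or randint(), which is inclusive); a quick sketch:
    Code:
    from random import randrange, randint

    randrange(7, 23 + 1)   # integer in [7, 23]; the stop value is excluded
    randint(7, 23)         # same inclusive range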

    Soma
    “Salem Was Wrong!” -- Pedant Necromancer
    “Four isn't random!” -- Gibbering Mouther

  3. #3
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Quote Originally Posted by phantomotap View Post
    O_o

    A change in uniformity is the least of your worries.

    Code:
    max = 100
    min = 0
    interval = max - min + 1 # 101
    factor = pow(10,floor(log10(interval) + 1)) # 1000
    result = int(0.0 * factor % interval) + min # 0
    result = int(1.0 * factor % interval) + min # 91
    That result is expected, no? The output isn't a direct map of the float progression. Or am I missing something deeper here?

    You've been working with Python so I'm going to suggest the `random.randrange(min, max)' function for now.
    ... or randint(). But this isn't about Python, although I'm using it for testing purposes. It's instead for a class I'm preparing for my math students. But in any case, here is a Python implementation. You'll see all results get generated with what appears to be a uniform distribution.

    Code:
    import math
    from collections import defaultdict
    from random import random 
    
    
    def rand(min_, max_, samples=1):
        if samples < 1:
            raise ValueError('Invalid number of samples.')
        if min_ > max_:
            raise ValueError('Interval error. min must not exceed max.')
    
        interval = max_ - min_ + 1
        factor = 10 ** math.floor(math.log10(interval) + 1)
    
        if samples > 1:
            results = defaultdict(int)
            for _ in range(samples):
                results[int(random() * factor % interval) + min_] += 1
            return sorted(results.items())
        else:
            return int(random() * factor % interval) + min_
    A quick test case:

    Code:
    for value, amount in rand(7, 23, 10000):
        print(value, ':', amount)
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  4. #4
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    That result is expected, no? The output isn't a direct map of the float progression. Or am I missing something deeper here?
    O_o

    You'll have to forgive me. I'm not great with mathematics, so I'm necessarily not very good at explaining it.

    Happily, you provided an example.

    Code:
    for value, amount in rand(1, 3, 10000):
        print(value, ':', amount)
    I think the generated range is the problem, but I lack the mathematics to explain why I think the near limit of the generated range (the 91 value I called attention to in my post) is wrong.

    *shrug*

    I'll just stay over here until one of the mathematicians shows.

    Soma
    “Salem Was Wrong!” -- Pedant Necromancer
    “Four isn't random!” -- Gibbering Mouther

  5. #5
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Quote Originally Posted by phantomotap View Post
    Happily, you provided an example.

    Code:
    for value, amount in rand(1, 3, 10000):
        print(value, ':', amount)
    Ah! Good catch. That produced a result way off.

    I felt something was missing, because earlier I had come across a similar result but left the resolution for later; when the time came, I couldn't remember how to reproduce it anymore. None of my tests were revealing why I needed another factor of 10. Hence this thread.

    That is solved by increasing the factor variable by a factor of 10. This will increase the integer portion of the dividend and produce a more accurate mod result.

    You helped

    EDIT: For future reference, the fix is then:

    Code:
    factor = 10 ** (math.floor(math.log10(interval) + 1) + 1)
    or perhaps more readable:

    Code:
    factor = 10 ** math.floor(math.log10(interval) + 1) * 10
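    A quick sketch to see the difference, tallying rand()-style results for the awkward 1..3 interval with the old factor (10) and the corrected factor (100):
    Code:
    import math
    from collections import Counter
    from random import random

    min_, max_, samples = 1, 3, 10000
    interval = max_ - min_ + 1                                         # 3
    for factor in (10 ** math.floor(math.log10(interval) + 1),         # 10  (old)
                   10 ** (math.floor(math.log10(interval) + 1) + 1)):  # 100 (new)
        counts = Counter(int(random() * factor % interval) + min_
                         for _ in range(samples))
        print(factor, sorted(counts.items()))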
    Last edited by Mario F.; 04-16-2015 at 10:23 PM.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  6. #6
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    That is solved by increasing the factor variable by a factor of 10. This will increase the integer portion of the dividend and produce a more accurate mod result.
    O_o

    I understand that perfectly well, yet I still couldn't explain why I thought the generated near upper limit felt off.

    or perhaps more readable:
    Code:
    factor = 10 ** math.floor(math.log10(interval) + 2)
    I think, in context, that line makes for not just a more readable statement but a better-documented one.

    Soma
    “Salem Was Wrong!” -- Pedant Necromancer
    “Four isn't random!” -- Gibbering Mouther

  7. #7
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Quote Originally Posted by phantomotap View Post
    I understand that perfectly well, yet I still couldn't explain why I thought the generated near upper limit felt off.
    Maybe it was the modulus that caused the confusion? It partitions the result space, and the input upper limit will not always map to the output upper limit.

    EDIT:

    The initial code is also misleading. I think that is probably why.

    result = int(random() * factor % interval) + min

    can be better expressed

    result = int(random() * factor) % interval + min
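    A quick sketch to check that equivalence (for nonnegative x and a positive integer interval the two expressions agree exactly, since the float modulo is computed exactly):
    Code:
    from random import random, randrange

    for _ in range(100000):
        interval = randrange(1, 1000)
        x = random() * randrange(1, 100000)
        assert int(x % interval) == int(x) % interval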

    EDIT 2:
    I have always kept in mind that there is, in fact, a slight loss to the initial distribution properties of the random() function. Mathematically, the above is sound (which is the part that interests me at this point). But with the limitations of floating-point numbers, random() will only return values from a subset of the reals. The partitions from the modulus operation aren't going to be equally sized because, at least with the libraries I'm using, there's no concept of infinite precision.

    That said, the loss is negligible. Or at least that's my understanding. I would also like to hear from others.
    Last edited by Mario F.; 04-16-2015 at 11:51 PM.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  8. #8
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    Maybe it was the modulus that caused the confusion? It partitions the result space, and the input upper limit will not always map to the output upper limit.
    O_o

    The slight difference in the generated limit is only a side effect of the code I wrote around your original code to examine the conditions of the predicted bias.

    My code generated strictly one less, for the 1.0 value, than your code for any given range while exhibiting less obvious bias for particular ranges.

    My code also has a canceling bug (as in raising a number to the zeroth power) for certain ranges.

    I honestly just assumed, incorrectly as it happens, that the approach you used, which doesn't have such a canceling bug, was wonky in that the range was slightly different than intended.

    [Edit]
    I only just realized that I'm overloading the "range" word.

    I do realize that the generated values do not correspond with the inputs as some of my use of "range" might suggest.

    I have occasionally been using "range" in the context of function domain and range instead of the interval context.
    [/Edit]

    My complaint (#6) was really more that I can't explain why my code should have less bias than yours, especially after the explanation you provided for your solution, because my version uses values of the same magnitude as your original code.

    *shrug*

    If I can't figure anything out on my own, I might ask for help in a bit.

    Soma
    “Salem Was Wrong!” -- Pedant Necromancer
    “Four isn't random!” -- Gibbering Mouther

  9. #9
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    >_<

    I'm an idiot.

    The altered code I wrote around your code to watch the conditions for the predicted bias has the same problem as your original code.

    [Edit]
    An explanation for the altered code:

    My harness is designed to explore different distribution methods using the same sample data. I think resetting the generator is the wrong approach, not to mention a waste of memory and/or time spent writing files to store the intermediate results for large sets, so I use callbacks to request the same random number from within each distribution method and only record deltas for the results. To really benefit from the technique, you need to process many different methods. The original post predicts bias, so I used several variations of almost the same code.
    [/Edit]

    I only managed to scatter the bias across more values; the larger the range the more obvious the bias.

    Oddly enough, my solution for watching puny values was scaling.

    *shrug*

    Sadly, the reason I couldn't explain the source of the apparent differences is that such differences do not exist.

    I was simply looking at insufficient data.

    Disregarding my approach to testing corner cases, the results were just a coincidence. ;_;

    Soma
    “Salem Was Wrong!” -- Pedant Necromancer
    “Four isn't random!” -- Gibbering Mouther

  10. #10
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Lecture mode!

    Aliasing occurs when you fold a larger range of integer values to a smaller range using modulus or scaling. It leads to some values occurring more often than others, even if the original distribution was uniform. Your example suffers from this.

    For example, if you have prng() that provides an integer between 0 and 15, inclusive, and you use prng() % 9 to generate an integer between 0 and 8, inclusive, you'll see 7 and 8 half as often as you see the other values:
    Code:
    0    1    2    3    4    5    6    7    8 │ results
    ──┼────┼────┼────┼────┼────┼────┼────┼────┼───────────────
    0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │ 8  │ prng() % 9
    9 │ 10 │ 11 │ 12 │ 13 │ 14 │ 15 │    │    │
    If you used (prng()*9)/16, then 4 and 8 would occur half as often as the other values:
    Code:
    0    1    2    3    4    5    6    7    8 │ results   
    ──┼────┼────┼────┼────┼────┼────┼────┼────┼───────────────
    0 │  2 │  4 │  6 │  8 │  9 │ 11 │ 13 │ 15 │ prng()*9/16
    1 │  3 │  5 │  7 │    │ 10 │ 12 │ 14 │    │
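    Both foldings are easy to reproduce in Python, counting over the full prng() range 0..15 (a quick sketch):
    Code:
    from collections import Counter

    print(Counter(v % 9 for v in range(16)))        # 0..6 occur twice, 7 and 8 once
    print(Counter(v * 9 // 16 for v in range(16)))  # 4 and 8 occur once, the rest twice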
    The closer the number of output values is to the number of values the prng() generates, the worse the aliasing effect is.

    This is often ignored because most use cases involve small ranges, and the prng() output ranges tend to be huge.

    For example, if you used prng() % 1000 where prng() yields integers between 0 and 2147483647, inclusive, the aliasing effect is under 0.00005%. In other words, some of the thousand possible result values occur 2147483/2147484 times as often as the others. Very small difference, and therefore easy to ignore.

    To avoid aliasing, the exclusion method is often used.

    The idea is simple: you just discard any prng() values that are outside your desired range. This retains the uniform distribution. However, if the desired range is small compared to the range of values prng() produces, a lot of generated numbers end up being excluded, wasting a lot of time. If prng() is good enough, its digits (binary digits in particular) will also have a uniform probability distribution, and we can consider only its lowest bits. This means that, on average, we will generate fewer than two numbers for each result.

    For example,
    Code:
    #include <limits.h>   /* CHAR_BIT */

    /* Assumes an `unsigned int prng(void)` returning uniformly distributed random bits. */
    int prng_range(const int min, const int max)
    {
        unsigned int limit = max - min;
        unsigned int mask, value;
    
        /* mask = 2**(floor(log2(limit))+1)-1. */
        mask = limit;
        mask |= mask >> 1;
        mask |= mask >> 2;
        mask |= mask >> 4;
        mask |= mask >> 8;
        mask |= mask >> 16;
        if (CHAR_BIT * sizeof mask > 32) mask |= mask >> 32;
        if (CHAR_BIT * sizeof mask > 64) mask |= mask >> 64;
    
        /* Exclusion method: */
        do {
            value = prng() & mask;
        } while (value > limit);
    
        return min + value;
    }
    The above uses a commonly known bit-fiddling trick to calculate the mask: it should have all binary digits set to 1 up to and including the highest digit set in limit.

    Aliasing also occurs with floating-point values, but to a lesser degree. (I believe this is often called quantization error, which is a more intuitive and descriptive name for the problem.)

    Most computers nowadays use IEEE-754 binary64 (double) or binary32 (float) floating-point types. These have the property that they can represent as many distinct values between 0.5 and 1.0 as they can between 0.25 and 0.5, or between 0.125 and 0.25, or between 0.0625 and 0.125, and so on. Note: the distribution of the values stays uniform between 0 and 1; it is just that there are more unique values representable closer to zero. This rarely matters in practice, and is typically only observable in some rare cases, when generating pseudorandom numbers to a non-uniform probability distribution. In particular, when multiplied and converted to an integer, the density of distinct values does not matter -- the distribution is uniform, and so is the distribution of the resulting integers, too.
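    A quick way to see that density difference (a sketch; math.ulp(), available in Python 3.9+, reports the spacing to the next representable double, and that spacing halves with each binade while the count of values per binade stays the same):
    Code:
    import math

    for x in (0.75, 0.375, 0.1875, 0.09375):
        print(x, math.ulp(x))   # the spacing halves each time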

    When generating pseudorandom numbers to a logarithmic distribution, it actually helps, yielding more distinct result values, without distorting the distribution. So it's not a clear-cut bad thing.

    One can avoid this aliasing by using at most as many bits as the floating-point type has (logically) in its mantissa. This results in uniform density of distinct values over the range 0 to 1.
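    For binary64 that looks like the following sketch: 53 bits is the (logical) mantissa width, so every output lands on the same 2**-53 grid, which is essentially what CPython's random.random() does internally:
    Code:
    from random import getrandbits

    def uniform53():
        # 53 uniformly random bits scaled into [0.0, 1.0) on a uniform 2**-53 grid
        return getrandbits(53) * 2.0 ** -53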

    In general, it is a good idea to start with a pseudorandom number generator yielding uniformly distributed unsigned integers, with all binary digits having a uniform distribution. (I'd say anything that passes the Diehard tests should be fine. I personally love Xorshift, and have used the Mersenne Twister extensively.)

    Functions that provide uniform random integers or floating-point values within a specified range, should use the pseudorandom number generator directly, and avoid aliasing effects where those effects are unwanted.

    If, however, you only have a prng() that returns an uniformly distributed value at least zero, but less than 1, you should just use
    Code:
    /* Range is inclusive, both min and max are returned */
    int prng_irange(const int min, const int max)
    {
        return min + (int)(((double)max - (double)min + 1) * prng());
    }
    as the power-of-ten-followed-by-modulus just causes additional aliasing. Yes, this way will have some aliasing, too, but we cannot avoid it without access to, or at least details of, the internals of prng(). Which is why you should have an unsigned-integer-based prng() in the first place.
    Last edited by Nominal Animal; 04-17-2015 at 07:33 AM.

  11. #11
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    I appreciate the lecture, Nominal. But I'm afraid it's just not adding to what I already know about RNGs.

    Where I am stuck, instead, is at algorithm analysis. I need a more concrete analysis of the OP pattern. As I said in post #7, I am fully aware of the loss of uniform distribution quality. The question, however, remains: how significant is it in the current space? I believe it is not significant. But that's mostly a hunch, based only on the mathematical validity, which, given pseudo RNGs and floating-point limitations, carries very little weight.

    While the folding being done by the OP pattern is obvious, you must realize it is being done uniformly across the range of generated values, with only the last mod partition being greatly affected. Meanwhile, the mod partitions being generated are a great deal more numerous than the interval. I'm confident this weakens the general argument of a loss of uniformity. But I'm not sure.

    I'd appreciate being taught or directed to the methodologies to test it or, if it isn't too much of a bother, an actual exposition of how big the weakness is.

    For all purposes, let's assume an MT core with 53-bit precision floats and a period of 2 ** 19937 - 1.

    EDIT: Please understand. I'm not very keen on written rules. I always cringe when I'm told I shouldn't do floating point operations, for instance. Think of the money! It's like being told that because taking a dump on the street is bad, I shouldn't take a dump in the middle of the forest where no one is looking and no harm is done. So, instead of just accepting the well-known patterns (rng() * interval + min), I'm more interested in exploring new or little-used patterns and understanding their implications. It's not that rng() * interval + min hurts my sensitivities in any way. I'm just exploring other avenues.
    Last edited by Mario F.; 04-17-2015 at 08:52 AM.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  12. #12
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by Mario F. View Post
    Where I am stuck, instead, is at algorithm analysis. I need a more concrete analysis of the OP pattern. As I said in post #7, I am fully aware of the loss of uniform distribution quality. The question, however, remains: how significant is it in the current space?
    I hope you realize I didn't intend to lecture you; I just switched to lecture mode.

    Yes, that is quite a complicated issue mathematically. The theoretical handling of it is stinky, so let's go the practical, numeric route, shall we?

    The following program takes two parameters: the factor (the number of uniformly distributed integers generated) and the interval (the number of unique result values).
    Code:
    #include <stdlib.h>
    #include <stdio.h>
    
    int main(int argc, char *argv[])
    {
        unsigned long values, modulus, i, *count;
        char dummy;
    
        if (argc != 3) {
            fprintf(stderr, "\nUsage: %s FACTOR INTERVAL\n\n", argv[0]);
            return EXIT_FAILURE;
        }
    
        if (sscanf(argv[1], " %lu %c", &values, &dummy) != 1 || values < 2UL) {
            fprintf(stderr, "%s: Invalid factor.\n", argv[1]);
            return EXIT_FAILURE;
        }
        if (sscanf(argv[2], " %lu %c", &modulus, &dummy) != 1 || modulus < 2UL || modulus > values) {
            fprintf(stderr, "%s: Invalid interval.\n", argv[2]);
            return EXIT_FAILURE;
        }
    
        count = malloc((size_t)modulus * sizeof *count);
        if (!count) {
            fprintf(stderr, "Not enough memory.\n");
            return EXIT_FAILURE;
        }
    
        for (i = 0UL; i < modulus; i++)
            count[i] = 0UL;
    
        for (i = 0UL; i < values; i++)
            count[i % modulus]++;
    
        i = 0UL;
        while (i < modulus) {
            unsigned long o = i;
    
            while (o + 1UL < modulus && count[o] == count[o+1UL])
                o++;
    
            if (o > i)
                printf("%lu .. %lu occur %lu/%lu of the time each\n", i, o, count[i], values);
            else
                printf("%lu occurs %lu/%lu of the time\n", i, count[i], values);
    
            i = o + 1UL;
        }
    
        free(count);
        return EXIT_SUCCESS;
    }
    First, let's look at the smallest factor, 10, and the probabilities of each result for different intervals (assuming prng() produces uniformly distributed nonnegative values less than one):
    Code:
    Interval    2     3     4     5     6     7     8     9
      Result
          0    5/10  4/10  3/10  2/10  2/10  2/10  2/10  2/10
          1    5/10  3/10  3/10  2/10  2/10  2/10  2/10  1/10
          2          3/10  2/10  2/10  2/10  2/10  1/10  1/10
          3                2/10  2/10  2/10  1/10  1/10  1/10
          4                      2/10  1/10  1/10  1/10  1/10
          5                            1/10  1/10  1/10  1/10
          6                                  1/10  1/10  1/10
          7                                        1/10  1/10
          8                                              1/10
    This means that with an interval = 9, your solution provides 0, the lower bound, 20% of the time, whereas every other value occurs only 10% of the time each. That's a pretty bad bias.

    For larger factors, the aliasing when interval == factor-1 stays bad. For example, if interval=9999 and factor=10000, then the lower bound occurs 2/10000 of the time, versus every other value occurring only 1/10000 of the time each.

    Again, none of this aliasing occurs if you just use min + (int)((double)interval * prng()).

    What if we scale factor by ten, so that it is always at least ten times the interval?
    Code:
    Interval   2         3         4         5         6         7         8         9
    Result
        0    50/100    34/100    25/100    20/100    17/100    15/100    13/100    12/100
        1    50/100    33/100    25/100    20/100    17/100    15/100    13/100    11/100
        2              33/100    25/100    20/100    17/100    14/100    13/100    11/100
        3                        25/100    20/100    17/100    14/100    13/100    11/100
        4                                  20/100    16/100    14/100    12/100    11/100
        5                                            16/100    14/100    12/100    11/100
        6                                                      14/100    12/100    11/100
        7                                                                12/100    11/100
        8                                                                          11/100
    The aliasing drops, so that the difference in probabilities is at most one percentage point; that is, no value occurs more than one percentage point more often than any other. (Relatively speaking, if the interval is 9 and the factor 100, then 0 occurs roughly 9% more often than 1, or indeed any of the other values. In absolute terms, 0 occurs 12% of the time, whereas each of the other values occurs only 11% of the time.)

    If interval=9999 and factor=100000, then the ten integers closest to the lower bound occur 11/100000 (0.011%) of the time each, whereas the others occur 10/100000 (0.010%) of the time each.

    So, as you can see, the aliasing problem tends to be bad when the interval (used as the divisor) is close to the factor. You can mitigate it by scaling the factor larger, but I just cannot understand why you'd use the modulo operation in the first place.

    Please do experiment with the above program, using different values, to explore various factors and intervals. It definitely helps me to grasp this aliasing/folding issue at an intuitive level.
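    If you'd rather stay in Python, the same exhaustive count is a short sketch:
    Code:
    from collections import Counter

    def alias_counts(factor, interval):
        return Counter(i % interval for i in range(factor))

    print(alias_counts(10, 9))    # 0 occurs twice, 1..8 once each
    print(alias_counts(100, 9))   # 0 occurs 12 times, 1..8 occur 11 times each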

    Quote Originally Posted by Mario F. View Post
    Please understand. I'm not very keen on written rules. I always cringe when I'm told I shouldn't do floating point operations, for instance.
    Oh, absolutely! That's why I prefer longer posts, giving the readers the tools to examine the issue more closely, rather than giving "rules of thumb" that eventually turn out to be quite misleading.

    My point was that if you start with a prng() that produces nonnegative integers up to some 2^N - 1 inclusive, with all binary digits also having uniform distribution (as is automatically the case if the generator is good, i.e. random, enough), you will have much better tools to provide the higher-level abstractions, like random vectors, or nonuniform distributions.

    Quote Originally Posted by Mario F. View Post
    So, instead of just accepting the well-known patterns (rng() * interval + min), I'm more interested in exploring new or little-used patterns and understanding their implications. It's not that rng() * interval + min hurts my sensitivities in any way. I'm just exploring other avenues.
    That is an excellent thing, and I hope I'm not curbing your enthusiasm. It just happens that the modulo operation causes this aliasing issue. I've been kicked in the teeth by it, so I may be a bit too keen on helping others avoid it, too.

    The exclusion method I mentioned is way underutilized. I've basically only seen it in physics simulations, where the aliasing issues tend to be annoyingly visible in the results. There are some good articles on it (in numerical computing), but I couldn't find any good references using a quick Ducking/Googling. (Funnily enough, the fourth match on duckduckgo.com was my own post at LinuxQuestions.org.)

    It should be used more often, especially when your lower-level generator is fast, say Xorshift.

    The exclusion method is quite useful for floating-point prng(), too.

    For example, consider the case where you want to generate a random unit vector in 3D. You have two basic approaches: either you use trigonometric functions (basically a polar coordinate system and adjustments to make each direction equally probable), or the exclusion method. The exclusion method ends up being MUCH faster, and is easy to prove to yield uniform vectors:
    Code:
    #include <math.h>   /* sqrt() */

    /* Assumes a `double prng(void)` returning uniform values in [0.0, 1.0). */
    typedef struct {
        double x;
        double y;
        double z;
    } vec3d;
    
    vec3d prng_vector(const double length)
    {
        vec3d v;
        double vv;
    
        do {
            v.x = 2.0 * prng() - 1.0;
            v.y = 2.0 * prng() - 1.0;
            v.z = 2.0 * prng() - 1.0;
            vv = v.x*v.x + v.y*v.y + v.z*v.z;
        } while (vv <= 0.25 || vv >= 1.0);
    
        vv = length / sqrt(vv);
        v.x *= vv;
        v.y *= vv;
        v.z *= vv;
    
        return v;
    }
    The above function generates uniform random 3D vectors in the cube (-1,-1,-1)-(+1,+1,+1), but rejects all outside the unit sphere, and all inside the half-radius sphere (to avoid aliasing/quantization for short vectors). The volume of the cube is 8, and the volume of the accepted shell is 4π/3-π/6 ≃ 3.665, so on average, we need to generate 8/3.665 ≈ 2.183 random vectors (about 6.548 random numbers) for each result vector.

    Even with that overhead, this ends up being much faster than the trigonometric approach.

    For 2D, the unit vector generation is simpler, as you can just generate one value between 0 and 2π and use that as the direction; i.e. a = 2.0*3.14159265358979323846*prng(); x = cos(a); y = sin(a);
    However, on architectures without hardware trigonometric functions, or slow sin()/cos() implementation, the exclusion method may turn out to be faster.
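    The 2D exclusion variant mirrors the 3D code above; a sketch in Python (Python's random() stands in for prng()): draw points in the square, keep those in the ring between radius 0.5 and 1, and normalize:
    Code:
    from math import sqrt
    from random import random

    def prng_unit2d():
        while True:
            x = 2.0 * random() - 1.0
            y = 2.0 * random() - 1.0
            vv = x * x + y * y
            if 0.25 < vv < 1.0:   # keep the ring, reject too-short and too-long vectors
                s = 1.0 / sqrt(vv)
                return x * s, y * s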

    For my own needs, I tend to write prng_vector() using integer types, only converting to floating-point formats at the very end. On most architectures, it ends up slightly faster (being able to use more than one ALU in superscalar processors like 64-bit AMD and newer Intel processors) than floating-point arithmetic. Plus it's obvious then that the distribution is uniform (given uniform integers).

  13. #13
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    How is your exclusion method different from nrand() from "Accelerated C++"?
    https://groups.google.com/forum/#!ms...A/WevBqpEv7QEJ

    Code:
    #include <stdlib.h>   /* rand(), RAND_MAX */

    // return a random integer in the range [0, n)
    int nrand(int n)
    {
        if (n <= 0 || n > RAND_MAX)
            return -1; // C++: throw domain_error("Argument to nrand is out of range");
    
        const int bucket_size = RAND_MAX / n;
        int r;
    
        do r = rand() / bucket_size;
        while (r >= n);
    
        return r;
    }
    gg

  14. #14
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    I will carefully digest, as it deserves, your post later tonight or tomorrow morning. Right now I'm about to start a late meeting and don't know if it will extend beyond dinner.

    But for now, just a quick correction to your assumptions about the OP. You are showing a huge bias which is definitely not the case in my code. I wouldn't be showing up to debate a pattern if it were so weak that with a single-digit interval you were already cleaning the floor with it.

    For every interval, the factor is always pow(10, number of digits of the interval + 1). So, for an interval of 9, the factor is 100, for an interval of 99, the factor is 1000, and so on. You should look at the discussion up to post #5, where that was established.

    EDIT:
    For the purpose of this discussion, also notice that the factor appears to be key to retaining an appreciable portion of the uniform distribution properties of the source prng(). So I would wager that increasing the factor by another factor of 10 would further the algorithm strength, and another 10, even more. Without any consideration for performance, that holds up to the floating-point limitations of the core generator.
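    A quick sketch to put numbers on that wager: exhaustive counts for a worst-case interval at increasing factors, where the spread between the most and least frequent results shrinks by roughly a factor of ten at each step:
    Code:
    from collections import Counter

    interval = 9999
    for factor in (10 ** 4, 10 ** 5, 10 ** 6):
        counts = Counter(i % interval for i in range(factor))
        lo, hi = min(counts.values()), max(counts.values())
        print(factor, lo, hi, (hi - lo) / factor)   # absolute bias per result value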
    Last edited by Mario F.; 04-17-2015 at 12:14 PM.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  15. #15
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by Codeplug View Post
    How is your exclusion method different from nrand() from "Accelerated C++"?
    Mine avoids the division (which tends to be slow compared to the other basic arithmetic and binary operators).

    nrand(), which is new to me, avoids relying on the uniformity of the low-order digits, and instead uses the largest multiple of the interval that does not exceed RAND_MAX as the included set. To get the result, you divide the source pseudorandom number by the bucket size (RAND_MAX / n). Only the very highest pseudorandom values, the ones that would have folded over onto the smallest result values had the modulo operation been used (causing the aliasing effects), are excluded.

    Overall, I think I like nrand() better, as low-order binary digits of many pseudorandom number implementations are not as random as the higher digits are, and nrand() avoids that issue. It is more robust, in other words.

    However, quick microbenchmarking with a Xorshift128+ generator (but using only 31 low bits of each generated 64-bit word) indicates that on my Intel Core i5-4200U nrand() takes about five times as many CPU cycles on average, compared to the exclusion method, on ranges varying from 2 to 16384. The test was crude (repurposed some existing code) and I don't fully trust it, but it does look like my exclusion method is at least twice as fast on average when used with a fast PRNG like Xorshift128+.

    I can tell you that if the interval size does not vary often, you should split the setup into a separate operation (perhaps saving it in a separate structure). You just need to precalculate n*bucket_size to compare the prng() output to, avoiding the division in the loop altogether in nrand(). This way you only need one division per generated number. For my exclusion method, the mask can similarly be precalculated, although the savings are relatively modest compared to the division case.
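    A sketch of that precomputation (in Python for brevity; prng() here is a stand-in 31-bit generator, and the helper names are mine): the loop body only compares, and the single division happens after a value is accepted:
    Code:
    from random import getrandbits

    RAND_MAX = 2 ** 31 - 1

    def prng():
        return getrandbits(31)               # stand-in: uniform in [0, RAND_MAX]

    def make_nrand(n):
        bucket_size = RAND_MAX // n          # computed once per interval size
        limit = n * bucket_size              # values >= limit are excluded
        def nrand():
            while True:
                r = prng()
                if r < limit:
                    return r // bucket_size  # the only division per result
        return nrand

    nrand9 = make_nrand(9)
    print(nrand9())                          # integer in [0, 9)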

    PRNG speed/efficiency does matter for certain types of simulations, like Ising models and Monte Carlo methods, where the PRNG is often the bottleneck (as the operations done on the individual random numbers are simple). For normal applications, the difference between the speed of nrand() and my exclusion method is lost in the noise, and the robustness of nrand() beats an insignificant speed increase.

    If the core PRNG is much slower, then the difference is lost in the noise, and nrand() is the obvious choice (because it is more robust wrt. PRNG low-order digit uniformity).

    Quote Originally Posted by Mario F. View Post
    You are showing a huge bias which is definitely not the case in my code.
    Oh, right. The larger aliasing errors are shown for the case factor = 10**ceil(log10(interval)), but you specified factor = 10**ceil(log10(interval) + 1) (or the equivalent-except-at-exact-powers-of-ten 10**(floor(log10(interval)) + 2)).

    The aliasing errors for your particular case, for small ranges, are shown in the chart following the line "What if we scale factor by ten, so that it is always at least ten times the interval?"
    In other words, the aliasing error is at most about one percentage point in the worst case.

    Quote Originally Posted by Mario F. View Post
    So I would wager that increasing the factor by another factor of 10 would further the algorithm strength, and another 10, even more.
    Yes, it would (to about 0.1 and 0.01 percentage points, respectively), and you can calculate the aliasing error for the just-under-a-power-of-ten interval case, as that always has the worst-case aliasing error.

    What I fail to see is what benefit there is to the extra operations (the power of ten and the modulus)?
    I don't like it when code works "just because"; I'd prefer there to be a reason.
