# Thread: random number with lognormal distribution?

1. ## random number with lognormal distribution?

it would be really nice if someone could point me to a fast-running function that could pick a random number with a lognormal probability.

the idea i have would work (it doesn't have to be terribly precise), but would probably be pretty slow. (i'm guessing around 100 calculations on average.)

first, i'd evaluate the distribution at several evenly spaced points to get a trapezoidal estimate of the area of each segment, assign each segment a sub-interval of (0,1) with length proportional to its area, and choose a random float with (float) rand()/RAND_MAX. that would select the segment, so i'd be more likely to land near the middle than in the tail. then i would choose another random float within that segment.
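A minimal C++ sketch of that idea, assuming the lognormal density with parameters mu and sigma and a cut-off range [lo, hi]. The names `lognormalPdf` and `SegmentSampler` are made up for illustration. The segment areas are computed once up front, so each draw costs a binary search plus two uniform numbers rather than a pass over every segment.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Lognormal density; mu and sigma describe the underlying normal.
double lognormalPdf(double x, double mu, double sigma) {
    const double kPi = 3.14159265358979323846;
    if (x <= 0.0) return 0.0;
    const double d = (std::log(x) - mu) / sigma;
    return std::exp(-0.5 * d * d) / (x * sigma * std::sqrt(2.0 * kPi));
}

// Hypothetical helper: stores cumulative trapezoid areas over [lo, hi].
class SegmentSampler {
public:
    SegmentSampler(double lo, double hi, int segments, double mu, double sigma)
        : lo_(lo), width_((hi - lo) / segments), cumulative_(segments) {
        assert(segments > 0 && hi > lo && sigma > 0.0);
        double total = 0.0;
        for (int i = 0; i < segments; ++i) {
            const double a = lo + i * width_;
            // Trapezoid rule: mean of the two edge heights times the width.
            total += 0.5 * (lognormalPdf(a, mu, sigma) +
                            lognormalPdf(a + width_, mu, sigma)) * width_;
            cumulative_[i] = total;
        }
    }

    template <class Rng>
    double operator()(Rng& rng) const {
        std::uniform_real_distribution<double> unit(0.0, 1.0);
        // Pick a segment with probability proportional to its area.
        const double target = unit(rng) * cumulative_.back();
        auto it = std::upper_bound(cumulative_.begin(), cumulative_.end(), target);
        if (it == cumulative_.end()) --it;
        const auto index = it - cumulative_.begin();
        // Then pick a point uniformly inside that segment.
        return lo_ + (index + unit(rng)) * width_;
    }

private:
    double lo_;
    double width_;
    std::vector<double> cumulative_;
};
```

Usage would look like `SegmentSampler sampler(0.0, 20.0, 100, 0.0, 0.5); double x = sampler(rng);` with any standard engine such as `std::mt19937`. Anything beyond `hi` is simply cut off, which is the price of the approximation.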

thanks
carrie

2. Use a polynomial fitting algorithm for the inverse CDF. Given a random number uniformly distributed over (0,1), the inverse CDF returns a random variable that follows the distribution.

I don't know where to get a log normal inverse CDF -- I have a 7th order polynomial approximation for the inverse CDF of the standard normal. I dunno if you can get the inverse CDF you need from this.
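As a side note, the lognormal inverse CDF can be written in terms of the standard normal one, so an approximation of the latter should carry over. Here \Phi is the standard normal CDF, and \mu and \sigma are assumed to be the mean and standard deviation of the log of the variable:

```latex
F(x) = \Phi\!\left(\frac{\ln x - \mu}{\sigma}\right)
\quad\Longrightarrow\quad
F^{-1}(u) = \exp\!\left(\mu + \sigma\,\Phi^{-1}(u)\right),
\qquad 0 < u < 1 .
```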

Otherwise, the way to do it is to generate a large set of points from the inverse CDF you want to approximate (I assume the PDF is given, you do numerical integration to get the CDF and invert the inputs/outputs for the inverse CDF). Then, try fitting various orders of polynomials. Some might overfit, so you need to be careful if you want to have accuracy. You should probably train (fit the polynomial) to something like 2/3 of the data points, and test it on the other 1/3. There are a number of ways you can do this, too.

What is the PDF of the function you want? I may try this (I'm in a course in Neural Networking, function approximation is a major part of this course so I have many tools for this kind of thing. Might be interesting to try.).

3. ha ha, i'm lame. there is always an easier way to do it! okay, the definition of a lognormal distribution is that the log of the variable is normally distributed. i assume that means this:

f(x)=(1/(x*sqrt(2*pi*sigma^2)))*exp(-(log(x)-mu)^2/(2*sigma^2))

all i have to do is find a _normally_ distributed number and take its exponential.
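In C++11 and later this is only a few lines. The sketch below assumes the standard `<random>` header is available; `lognormalDraw` is a made-up name, and mu and sigma are the parameters of the underlying normal, not the mean and standard deviation of the result.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Draws one lognormal variate as the exponential of a normal variate.
// mu and sigma belong to the underlying normal distribution.
template <class Rng>
double lognormalDraw(Rng& rng, double mu, double sigma) {
    assert(sigma > 0.0);
    std::normal_distribution<double> normal(mu, sigma);
    return std::exp(normal(rng));
}
```

For a hot loop it is probably worth constructing the `std::normal_distribution` object once outside the function and reusing it, since building it on every call throws away any state it caches.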

> Use a polynomial fitting algorithm for the inverse CDF. For a random evenly distributed over (0,1), it will generate a random var that follows the distribution.

what's a cdf? continuous distribution function? i'm assuming that the term "inverse cdf" means it takes a distribution and finds random numbers with that probability, as opposed to parsing a probability from a set of random numbers.

> I don't know where to get a log normal inverse CDF -- I have a 7th order polynomial approximation for the inverse CDF of the standard normal. I dunno if you can get the inverse CDF you need from this.

how fast would something like this run? this function will be called at least 600,000 times per reconstruction (1-1000 reconstructions per run; user decides). i would love to have a look at that function though.

> Otherwise, the way to do it is to generate a large set of points from the inverse CDF you want to approximate (I assume the PDF is given, you do numerical integration to get the CDF and invert the inputs/outputs for the inverse CDF). Then, try fitting various orders of polynomials. Some might overfit, so you need to be careful if you want to have accuracy. You should probably train (fit the polynomial) to something like 2/3 of the data points, and test it on the other 1/3. There are a number of ways you can do this, too.

okay, the software i'm writing is a simulated annealing algorithm, which means i change a parameter (i'm fitting gaussians but that has nothing to do with the distribution) at random, check the fit, and accept it based on a probability. i need the lognormal distribution to scale some of the random parameters. originally we wrote the software in mathematica, which has Random[LogNormalDistribution[mu,sigma]] built in, and while i'd like the port to follow it faithfully, i can't justify huge computations.
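A rough sketch of one such annealing move, to show where the lognormal factor fits in. Everything here is assumed for illustration: the name `annealStep`, the `cost` callable standing in for the fit check, and the Metropolis acceptance rule, which the post does not spell out. `std::lognormal_distribution` takes the mean and standard deviation of the underlying normal, which as far as I know matches Mathematica's parameterisation.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// One annealing move on a single positive parameter.
// `cost` is any callable returning the misfit for a parameter value
// (hypothetical; it stands in for the gaussian-fit check).
template <class Rng, class Cost>
double annealStep(Rng& rng, double param, double temperature,
                  double sigma, Cost cost) {
    assert(temperature > 0.0 && sigma > 0.0);
    // Lognormal factor with mu = 0: median 1, always positive.
    std::lognormal_distribution<double> scale(0.0, sigma);
    std::uniform_real_distribution<double> unit(0.0, 1.0);

    const double candidate = param * scale(rng);
    const double delta = cost(candidate) - cost(param);
    // Metropolis rule (assumed): always accept an improvement, accept a
    // worse fit with probability exp(-delta / temperature).
    if (delta <= 0.0 || unit(rng) < std::exp(-delta / temperature)) {
        return candidate;
    }
    return param;
}
```

Scaling by a lognormal factor keeps a positive parameter positive and makes halving and doubling equally likely, which is presumably why the original used it.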

> What is the PDF of the function you want? I may try this (I'm in a course in Neural Networking, function approximation is a major part of this course so I have many tools for this kind of thing. Might be interesting to try.).

what's a pdf?

thanks!
carrie

4. A CDF is a cumulative distribution function; a PDF is a probability density function. The CDF gives the probability that the variable is at most x, and the PDF is its derivative. The gaussian PDF is the "bell curve" we all know of.

Here's how to generate a random gaussian (normally distributed) variable -- You can modify this as necessary.

mean is the mean, and stdev the standard deviation you want for the distribution. You should seed the random number generator exactly once, before calling the functions.

I'm just gonna attach one of my source files from some projects I've done. The bizarre way I randomize is an attempt to get a very uniform distribution over (0,1); just dividing rand() by RAND_MAX yields at most RAND_MAX+1 distinct values, which is too coarse. The randomization is fast enough, although it's not really ideal, as there ARE better ways to do it.

gauss2.cpp has a few functions (gauss2.h only has the prototypes so I won't include it):

double gauss(double mean, double stdev)
This will return a random variable which follows a gaussian distribution with the given mean and standard deviation. This is probably what you want. Be sure to seed the random number generator (again, seed it only once) before you call this the first time.

double probZUT(double z)
Returns the probability that a standard normal random variable will take on a value > z (the area in the upper tail of the standard normal curve).

double probZLT(double z)
Returns the probability that a standard normal random variable will take on a value < z.

double pDevGauss(double x, double mean, double stdev)
Returns the probability that a random variable sampled from a normal distribution with the given mean and standard deviation would deviate from the mean by at least as much as x deviates; this is useful for 2-sided confidence interval z-tests.
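Since the attachment isn't shown here, the following is only a guess at how the three probability functions could be written, not the contents of gauss2.cpp. It reuses the names above and leans on `std::erfc` from `<cmath>` (C++11).

```cpp
#include <cassert>
#include <cmath>

// P(Z > z) for a standard normal Z, via the complementary error function.
double probZUT(double z) {
    return 0.5 * std::erfc(z / std::sqrt(2.0));
}

// P(Z < z) for a standard normal Z.
double probZLT(double z) {
    return 0.5 * std::erfc(-z / std::sqrt(2.0));
}

// Two-sided probability of deviating from the mean by at least |x - mean|.
double pDevGauss(double x, double mean, double stdev) {
    assert(stdev > 0.0);
    return 2.0 * probZUT(std::fabs(x - mean) / stdev);
}
```

For example, `pDevGauss(1.96, 0.0, 1.0)` comes out very close to 0.05, the usual two-sided 5% level.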

The other functions are utilities. One implements the inverse CDF of the standard normal; the other is a random double generator that randomizes to DOUBDIGS (default = 20) decimal digits. Reducing DOUBDIGS buys speed, but your variable won't be "as normal" as it will for larger values. This number should be high, so that the doubles returned (which are all discrete values, like any numbers in a computer) can better approximate a continuous distribution.
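For readers without the file, here is a sketch of the same inverse-CDF approach using a different, published approximation (Abramowitz and Stegun 26.2.23, absolute error below about 4.5e-4) in place of the 7th order polynomial. `invNormCdf` and `gaussFromUniform` are made-up names, and `std::generate_canonical` stands in for the custom random double generator.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Approximate inverse CDF of the standard normal for 0 < p < 1
// (Abramowitz & Stegun 26.2.23, a rational approximation).
double invNormCdf(double p) {
    assert(p > 0.0 && p < 1.0);
    const double tail = (p < 0.5) ? p : 1.0 - p;  // work in the smaller tail
    const double t = std::sqrt(-2.0 * std::log(tail));
    const double z =
        t - (2.515517 + t * (0.802853 + t * 0.010328)) /
            (1.0 + t * (1.432788 + t * (0.189269 + t * 0.001308)));
    return (p < 0.5) ? -z : z;
}

// Hypothetical stand-in for gauss(mean, stdev): push a uniform draw
// through the inverse CDF, then shift and scale.
template <class Rng>
double gaussFromUniform(Rng& rng, double mean, double stdev) {
    double u = 0.0;
    // Request 53 random bits, a double's full mantissa; reject the endpoints.
    while (u <= 0.0 || u >= 1.0) {
        u = std::generate_canonical<double, 53>(rng);
    }
    return mean + stdev * invNormCdf(u);
}
```

A double carries 53 bits of mantissa, roughly 15 to 17 significant decimal digits, so randomizing beyond that cannot add resolution. With this approximation the accuracy of the inverse CDF, not the uniform generator, is the limiting factor.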

5. Forgot the file...