Hash Function question
I have been browsing through the book "Introduction to Algorithms", and I was reading the chapter on Hash Tables/Hash Functions. (Ch.12). There are three hash function algorithms described: Division method, Multiplication method, and Universal Hashing.
Division method is simple, it is just: h(k) = k % m, where m is the maximum elements in the table. It is assumed that the key k is an integer (and if it is string there are methods given for converting string to int). So far, so good: For an implementation that at core uses something like "vector< list<T> > vec", we can always be sure that the value for key will be within bounds of the vector (because of k % vec.size()).
Now for multiplication method, it gives the function: h(k) = floor( M * ( (k * A) - floor(k * A))), where 0 < A < 1. The book suggests A = (sqrt(5) - 1)/2 as a good choice. M is an integer, and the books says the choice of M is not critical. In it's example it gives M as being 10000.
Now my question is, this function will return some value for h(k), and that will be my index into my vector<list<T>>. But how do I know that it will be within bounds of the array? Am I supposed to apply the division method after I have applied the multiplication method, to ensure that I have a value that is in the proper range? If so, this is not explained in the text.
The range will be [0..M), you would use size() for M to get your index. x - floor(x) is always in the range [0..1), here square bracket means including and paren means approaching. This is somewhat related to the modulus used in the "division" method. What it is basically doing is shifting part of K below the decimal point and "chopping off" the whole number part, yielding a number in the range [0..1), but without regard to how many buckets you have. Dividing a number by A is the same as multipliying by (A^-1) where ^ is power, not xor. When A is (0..1) this is the same as dividing by (1..inf). M then stretches this back out to the range you want. Note that you do have to be careful because while I can put a nice algebraic "this number approaches 1" your computer may very well decide at some point decide that it's close enough and return M, so you may want to use size()-1;
Thanks, for the confirmation. Just a mement ago, I realized that M * K, when K is say, .9999, and M is 10000, will yield 9999. So I guess the range should be 0 to M-1.
Anyway, thanks for the info/clarification.