Thread: Hash Function question

  1. #1
    Registered User
    Join Date
    Aug 2002
    Location
    Hermosa Beach, CA
    Posts
    446

    Hash Function question

    I have been browsing through the book "Introduction to Algorithms", and I was reading the chapter on Hash Tables/Hash Functions. (Ch.12). There are three hash function algorithms described: Division method, Multiplication method, and Universal Hashing.

    Division method is simple, it is just: h(k) = k % m, where m is the maximum elements in the table. It is assumed that the key k is an integer (and if it is string there are methods given for converting string to int). So far, so good: For an implementation that at core uses something like "vector< list<T> > vec", we can always be sure that the value for key will be within bounds of the vector (because of k % vec.size()).

    Now for multiplication method, it gives the function: h(k) = floor( M * ( (k * A) - floor(k * A))), where 0 < A < 1. The book suggests A = (sqrt(5) - 1)/2 as a good choice. M is an integer, and the books says the choice of M is not critical. In it's example it gives M as being 10000.

    Now my question is, this function will return some value for h(k), and that will be my index into my vector<list<T>>. But how do I know that it will be within bounds of the array? Am I supposed to apply the division method after I have applied the multiplication method, to ensure that I have a value that is in the proper range? If so, this is not explained in the text.

    Thanks.
    The crows maintain that a single crow could destroy the heavens. Doubtless this is so. But it proves nothing against the heavens, for the heavens signify simply: the impossibility of crows.

  2. #2
    Registered User
    Join Date
    Jan 2003
    Posts
    311
    The range will be [0..M), you would use size() for M to get your index. x - floor(x) is always in the range [0..1), here square bracket means including and paren means approaching. This is somewhat related to the modulus used in the "division" method. What it is basically doing is shifting part of K below the decimal point and "chopping off" the whole number part, yielding a number in the range [0..1), but without regard to how many buckets you have. Dividing a number by A is the same as multipliying by (A^-1) where ^ is power, not xor. When A is (0..1) this is the same as dividing by (1..inf). M then stretches this back out to the range you want. Note that you do have to be careful because while I can put a nice algebraic "this number approaches 1" your computer may very well decide at some point decide that it's close enough and return M, so you may want to use size()-1;

  3. #3
    Registered User
    Join Date
    Aug 2002
    Location
    Hermosa Beach, CA
    Posts
    446
    Thanks, for the confirmation. Just a mement ago, I realized that M * K, when K is say, .9999, and M is 10000, will yield 9999. So I guess the range should be 0 to M-1.

    Anyway, thanks for the info/clarification.
    The crows maintain that a single crow could destroy the heavens. Doubtless this is so. But it proves nothing against the heavens, for the heavens signify simply: the impossibility of crows.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 4
    Last Post: 05-13-2011, 08:28 AM
  2. Compiling sample DarkGDK Program
    By Phyxashun in forum Game Programming
    Replies: 6
    Last Post: 01-27-2009, 03:07 AM
  3. doubt in c parser coding
    By akshara.sinha in forum C Programming
    Replies: 4
    Last Post: 12-23-2007, 01:49 PM
  4. function pointer question
    By andrea72 in forum C++ Programming
    Replies: 7
    Last Post: 12-11-2007, 06:25 AM
  5. I need help with passing pointers in function calls
    By vien_mti in forum C Programming
    Replies: 3
    Last Post: 04-24-2002, 10:00 AM