Thread: double hashing not filling 100% of big tables

  1. #1
    Registered User
    Join Date
    Mar 2009
    Location
    Bozen
    Posts
    95

    double hashing not filling 100% of big tables

    Hello,

    I'm having some problems with filling a hash table using the following hash functions:

    Code:
    int h1(int k, int m){
    	int h1 =  mod(k,m);
    	return h1;
    }
    int h2(int k, int m){
    	int h2probe = mod(2*k+1,m);
    	return h2probe;
    }
    int h(int k,int i, int m){
    	int h1a = h1(k,m);
    	int h2a = h2(k,m);
    
    	int h = mod(h1a + i*h2a, m);
    	return h;
    
    int insert(int H[], int m, int key){
    
    	if (isFull(H, m) == -1) return -1;
    	int i=0;
    	int key2 = key*2+1;
    	if(key2 < 0) key /= 2;
    	int probe = h(key,i,m);			// 0 <= probe < m;
    
    	while (H[probe] != 0)			// we know there is room, so will terminate if the hashing function is able to map at each spot.
    		probe = h(key,++i,m);
    	H[probe] = key;
    
    	return probe;
    }

    The problem is that when I try to fill an 18 slots table slot 14 never gets a change to be filled. Similar deadlocks occur with other sizes. While a table of 10 elements fills correctly. The literature found about double hashing seem not to answer my problem. The insertion code simply tries again increasing i. However deadlocks occur. Any suggestions on how to solve it still using double hashing and the h functions specified above(so only play with the insertion method). Mod(a,m) return the module > 0.

  2. #2
    Registered User
    Join Date
    Mar 2009
    Location
    Bozen
    Posts
    95
    no hashing experts over here?

  3. #3
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    You may want to check that your two hash functions generate an equal number of odd and even hashes - if odd ones are more common, then you may find that you are more often than not ending up with an even number in your sum.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  4. #4
    Registered User
    Join Date
    Mar 2009
    Location
    Bozen
    Posts
    95
    hmm... I'm not clear on the problem (of even and odd numbers) and even more on how to achieve it in code inside the insertion function. I wonder if you could be more specific.

  5. #5
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    If two numbers are odd, the sum of them is always even. Certainly h2() is guaranteed to ALWAYS be odd. Key2 is also guaranteed to be odd in your calculation (unless it overflows into negative range - I don't know what your key value range is, so I can't say if it happens to be overflowing from time to time).

    So if your hashes in each hash function more often generates an odd number than even, you have more likelihood of getting an even number than an odd one.

    Since 18 is not an odd number, it is impossible that mod 18 will give an odd result.

    You could perhaps produce a table of all the results for your hash functions (seeing as there isn't a huge number of them), and the resulting combined hash.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  6. #6
    Registered User
    Join Date
    Mar 2009
    Location
    Bozen
    Posts
    95
    Code:
     int key2 = key*2+1;
    	if(key2 < 0) key /= 2;
    This is relevant only for overflow (to guarantee that the key is always positive.

    h2() produces an odd number, but h1() should produce an even number. The case are that the first will be an even number:
    h1 + 0h2
    the second an odd h1 + h2 .
    the 3rd an even h1 + 2h2
    and so on ...

    I don't see why mod 18 should not return odd numbers. Given 19 it should return the odd 1 for example.

  7. #7
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    18 comes from "I try to fill an 18 slots".

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  8. #8
    Registered User
    Join Date
    Mar 2009
    Location
    Bozen
    Posts
    95
    I'm still not clear. 18 is an upper bound for the mod, so using it how I do should guarantee values in the range up to 18. However 14 doesn't get it since I reach the deadlock situation of h1 returning 8 and h2 4 and counting down. The problem as I said doesn't occur with all table sizes, and in other cases the deadlock occurs before 14 being the last unfilled element in the table. So what can I do to fix the code to always fill all the slots?

  9. #9
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    So, have you analyzed what your hash values are for each key. Adding something like:
    Code:
    printf("key=%d, h1 = %d, %h2=%d, h=%d\n", k, h1a, h2a, h2);
    inside your h function will (I'm guessing, at least) show what is going on, and what values you are getting out of it.

    I'm still pretty sure that the problem is that you are not getting enough variation in your hash keys.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  10. #10
    Registered User
    Join Date
    Mar 2009
    Location
    Bozen
    Posts
    95
    I've already seen the problem and the values through the debugger. I found out: 14 doesn't get it since I reach the deadlock situation of h1 returning 8 and h2 4 and counting down. The problem as I said doesn't occur with all table sizes, and in other cases the deadlock occurs before 14 being the last unfilled element in the table. So what can I do to fix the code to always fill all the slots?

    Now the problem is that how can I tweak the code to make it go over that 12 when it reaches it, not modifying h1 and h2.

  11. #11
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Did you really just ask how you can make your hashing work right without changing your hashing functions? Run your hash in a loop, feeding it consecutive incrementing numbers, filling an array of X size, where each time it hits that spot, you increment the value in the array. If you find some cells ending up blank, then you need to tune your hashing functions.

    You either change the number of buckets, or you change the way you're filling them. That's how hashing works.


    Quzah.
    Hope is the first step on the road to disappointment.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. For the numerical recipes in C types!
    By Smattacus in forum C Programming
    Replies: 5
    Last Post: 10-28-2008, 07:57 PM
  2. Conversion From C++ To C
    By dicon in forum C++ Programming
    Replies: 2
    Last Post: 06-10-2007, 02:54 PM
  3. what's the difference?
    By iluvmyafboys in forum C++ Programming
    Replies: 13
    Last Post: 02-28-2002, 09:25 PM
  4. error message, what does it mean?
    By Shy_girl_311 in forum C++ Programming
    Replies: 5
    Last Post: 11-09-2001, 09:54 PM
  5. Hi, could someone help me with arrays?
    By goodn in forum C Programming
    Replies: 20
    Last Post: 10-18-2001, 09:48 AM