Thread: Hash implementation.

  1. #1
    Registered User
    Join Date
    Nov 2005
    Posts
    673

    Hash implementation.

    This is a hash produced by Bob Jenkins, and I just want to know from a more intelligent person than me if my overload method will work in all cases. It seems to be safe, but I would hate to start using it if it may fail under certain circumstances.

    Code:
    #include <string>
    
    /*
    	Hash Author: Bob Jenkins
    */
    
    unsigned Hash(char* pKey, unsigned pLength, unsigned pInit);
    unsigned Hash(unsigned char* pKey, unsigned pLength, unsigned pInit);
    unsigned Hash(const char* pKey, unsigned pInit);
    unsigned Hash(std::string pKey, unsigned pInit);
    
    
    
    unsigned Hash( char* pKey, unsigned pLength, unsigned pInit )
    {
    	return Hash(std::string((const char*)pKey,pLength), pInit);
    }
    
    unsigned Hash( unsigned char* pKey, unsigned pLength, unsigned pInit)
    {
    	return Hash(std::string((const char*)pKey, pLength), pInit);
    }
    
    unsigned Hash( const char *pKey, unsigned pInit )
    {
    	return Hash(std::string(pKey),pInit);
    }
    
    void mix(unsigned& a, unsigned& b, unsigned& c)
    {
    	a -= b; a -= c; a ^= ( c >> 13 );
    	b -= c; b -= a; b ^= ( a << 8 );
    	c -= a; c -= b; c ^= ( b >> 13 );
    	a -= b; a -= c; a ^= ( c >> 12 );
    	b -= c; b -= a; b ^= ( a << 16 );
    	c -= a; c -= b; c ^= ( b >> 5 );
    	a -= b; a -= c; a ^= ( c >> 3 );
    	b -= c; b -= a; b ^= ( a << 10 );
    	c -= a; c -= b; c ^= ( b >> 15 );
    }
    
    unsigned Hash(std::string pKey, unsigned pInit)
    {
    	unsigned a, b;
    	unsigned c = pInit;
    	unsigned len = pKey.length();
    	unsigned loc = 0;
    
    	a = b = 0x9e3779b9;
    	while ( len >= 12 ) {
    		a += ( pKey[loc] + ( (unsigned)pKey[loc] << 8 ) 
    			+ ( (unsigned)pKey[loc] << 16 )
    			+ ( (unsigned)pKey[loc] << 24 ) );
    		b += ( pKey[4] + ( (unsigned)pKey[5] << 8 ) 
    			+ ( (unsigned)pKey[loc] << 16 )
    			+ ( (unsigned)pKey[loc] << 24 ) );
    		c += ( pKey[8] + ( (unsigned)pKey[9] << 8 ) 
    			+ ( (unsigned)pKey[loc] << 16 )
    			+ ( (unsigned)pKey[loc] << 24 ) );
    
    		mix ( a, b, c );
    
    		loc += 12;
    		len -= 12;
    		if ( loc > len )
    			loc = len;
    	}
    
    	c += pKey.length();
    
    	switch ( len ) {
    	case 11: c += ( (unsigned)pKey[10] << 24 );
    	case 10: c += ( (unsigned)pKey[9] << 16 );
    	case 9 : c += ( (unsigned)pKey[8] << 8 );
    		/* First byte of c reserved for length */
    	case 8 : b += ( (unsigned)pKey[7] << 24 );
    	case 7 : b += ( (unsigned)pKey[6] << 16 );
    	case 6 : b += ( (unsigned)pKey[5] << 8 );
    	case 5 : b += pKey[4];
    	case 4 : a += ( (unsigned)pKey[3] << 24 );
    	case 3 : a += ( (unsigned)pKey[2] << 16 );
    	case 2 : a += ( (unsigned)pKey[1] << 8 );
    	case 1 : a += pKey[0];
    	}
    
    	mix ( a, b, c );
    
    	return c;
    }
    Thank you very much for any constructive criticism.
    [/SIZE][/FONT][/SIZE][/FONT]
    Last edited by Raigne; 04-06-2011 at 01:13 PM. Reason: Apparently copy/paste from VS2010 does colors but not formatting

  2. #2
    Registered User
    Join Date
    May 2009
    Posts
    4,183
    No it will not work in all cases.
    If "char" is unsigned it will likely not even compile.

    Tim S.

  3. #3
    Registered User
    Join Date
    Nov 2005
    Posts
    673
    Okay, reason for unsigned char not compiling? It compiles fine and works fine, I just want to know if there is something im missing, and a brief explanation of why that is.

  4. #4
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    He is simply wrong about that part.

    Presumably, he was thinking that if `char' was the same as `unsigned char' you would get an overload problem. (You would if you weren't using `char' as the base type. Feel free to try with `int' instead.)

    This isn't a problem though because `signed char', `unsigned char', and `char' are three different types.

    Anyway, because you are using a simple cast, the overloads and use of `std::string' should work.

    The problem is, your implementation is wrong. Or at least, it will not return the same results as the canonical implementation.

    Soma

  5. #5
    Registered User
    Join Date
    Nov 2005
    Posts
    673
    The implementation of the Hash is wrong?

    Code:
    unsigned char str[10] = "VariaTest";
    	std::cout << "Hash 0 ( VariaTest ) = " << Hash(str, 9, 0) << '\n';
    	std::cout << "Hash 1 ( VariaTest ) = " << Hash("VariaTest", 0) << '\n';
    	std::cout << "Hash 2 ( VariaTest ) = " << Hash(std::string("VariaTest"), 0 ) << '\n';
    Given this input it produces the same value.

  6. #6
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    O_o

    All of that uses the same implementation. That only serves to say that the overloads do the job of forwarding to the same implementation.

    Soma

  7. #7
    Registered User
    Join Date
    Nov 2005
    Posts
    673
    What is wrong with the implementation, and how to fix it?

    edit: I think I know what the problem was, let me fix it and I will repost.

    Fixed I think:
    Code:
    unsigned Hash(std::string pKey, unsigned pInit)
    {
    	unsigned a, b;
    	unsigned c = pInit;
    	unsigned len = pKey.length();
    	unsigned loc = 0;
    
    	a = b = 0x9e3779b9;
    	while ( len >= 12 ) {
    		a += ( pKey[loc] + ( (unsigned)pKey[loc+1] << 8 ) 
    			+ ( (unsigned)pKey[loc+2] << 16 )
    			+ ( (unsigned)pKey[loc+3] << 24 ) );
    		b += ( pKey[loc+4] + ( (unsigned)pKey[loc+5] << 8 ) 
    			+ ( (unsigned)pKey[loc+6] << 16 )
    			+ ( (unsigned)pKey[loc+7] << 24 ) );
    		c += ( pKey[loc+8] + ( (unsigned)pKey[loc+9] << 8 ) 
    			+ ( (unsigned)pKey[loc+10] << 16 )
    			+ ( (unsigned)pKey[loc+11] << 24 ) );
    
    		mix ( a, b, c );
    
    		loc += 12;
    		len -= 12;
    		if ( len <= 12 )
    		{
    			if ( len < 0 )
    				len = 1;
    			if ( loc > pKey.length()-10 )
    				loc = pKey.length()-10;
    		}
    	}
    
    	c += pKey.length();
    
    	switch ( len ) {
    	case 11: c += ( (unsigned)pKey[loc+10] << 24 );
    	case 10: c += ( (unsigned)pKey[loc+9] << 16 );
    	case 9 : c += ( (unsigned)pKey[loc+8] << 8 );
    		/* First byte of c reserved for length */
    	case 8 : b += ( (unsigned)pKey[loc+7] << 24 );
    	case 7 : b += ( (unsigned)pKey[loc+6] << 16 ); 
    	case 6 : b += ( (unsigned)pKey[loc+5] << 8 );
    	case 5 : b += pKey[loc+4];
    	case 4 : a += ( (unsigned)pKey[loc+3] << 24 );
    	case 3 : a += ( (unsigned)pKey[loc+2] << 16 );
    	case 2 : a += ( (unsigned)pKey[loc+1] << 8 );
    	case 1 : a += pKey[loc];
    	}
    
    	mix ( a, b, c );
    
    	return c;
    }
    Last edited by Raigne; 04-06-2011 at 03:16 PM.

  8. #8
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    O_o

    Okay. You must be talking about a different hash than I think you are.

    Do you have a link to the actual hash?

    That said, kudos for hearing a reference to a potential problem and trying to fix it on your own. It is always nice to see that.

    Soma

  9. #9
    Registered User
    Join Date
    Nov 2005
    Posts
    673
    Eternally Confuzzled - The Art of Hashing The hash is the last one torwards the bottom of the page. I am just trying to make it work with std::strings for simplicity. I hate managing c strings.

  10. #10
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    O_o

    Okay. It is the algorithm I was thinking of.

    As a matter of course, what does this code, from your implementation, do?

    Soma

    Code:
    if ( len <= 12 )
    {
    	if ( len < 0 )
    		len = 1;
    	if ( loc > pKey.length()-10 )
    		loc = pKey.length()-10;
    }

  11. #11
    Registered User
    Join Date
    Nov 2005
    Posts
    673
    Make sure that the 'loc' does not go out of bounds.

  12. #12
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    Make sure that the 'loc' does not go out of bounds.
    No. That is only the intent. That is not what it does.

    Here, try going stepping through your implementation with strings that have a length of between 11 and 25. (Trying once for each string length.)

    Soma

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Dynamic Hash Table Implementation
    By james123 in forum C Programming
    Replies: 1
    Last Post: 12-30-2009, 11:07 AM
  2. Problem with unaligned intrinsics
    By The Wazaa in forum C++ Programming
    Replies: 4
    Last Post: 02-18-2009, 12:36 PM
  3. Group Project Help/Volunteer
    By DarkDot in forum C++ Programming
    Replies: 3
    Last Post: 04-24-2007, 11:36 PM
  4. Hash Table implementation
    By supaben34 in forum C++ Programming
    Replies: 6
    Last Post: 09-26-2003, 12:48 PM
  5. hash table data structure implementation
    By sanju in forum C Programming
    Replies: 1
    Last Post: 09-27-2002, 05:06 AM