Help understanding hash function

**127.0.0.1** · 06-13-2011

I am looking over code that implements a linked list where one of its members is a string value -- to deal with this they used a hash table to quickly find elements. Although I do understand hash tables I am unable to fully understand the reasoning behind certain sections of code particularly the Com_HashKey.

cvar_t is an element of the global linked list of cvar_t named cvar_vars

Code declaring some size constants and the global variables

Code:

#define CVARS_HASH_SIZE  1024
#define MAX_LIST_CVARS   4096
cvar_t  *cvar_varsHashTable[CVARS_HASH_SIZE];
cvar_t  *cvar_vars;

Code used to initialize / find a cvar_t

Code:

/*
 =================
 Cvar_Get

 If the variable already exists, the value will be set to the latched
 value (if any).
 The flags will be OR'ed in if the variable exists.
 =================
*/
cvar_t *Cvar_Get (const char *name, const char *value, int flags, const char *description){
	cvar_t *cvar;
	unsigned hash;

	cvar = Cvar_FindVariable(name);
	if (cvar){
		// Update latched variables
		...
		// Reset value is always set internally
		...
		// Description is always set internally
		...
		return cvar;
	}
	// Check for command override
	...
	// Check for invalid name
	...
	// Check for invalid value
	...
	// Allocate a new cvar
	...
	// Link the variable in
	cvar->next = cvar_vars;
	cvar_vars = cvar;

	// Add to hash table
	hash = Com_HashKey(name, CVARS_HASH_SIZE);

	cvar->nextHash = cvar_varsHashTable[hash];
	cvar_varsHashTable[hash] = cvar;
	return cvar;
}

The function Com_HashKey which is where most of my confusion stems

Code:

/*
 =================
 Com_HashKey

 Returns a hash value for string
 =================
*/
unsigned Com_HashKey (const char *string, unsigned hashSize){

	unsigned	hash = 0;
	int			i;

	for (i = 0; string[i]; i++)
		hash = (hash + i) * 37 + tolower(string[i]);

	return (hash % hashSize);
}

Code used to find a cvar_t from cvar_vars using the hash table

Code:

/*
 =================
 Cvar_FindVariable
 =================
*/
cvar_t *Cvar_FindVariable (const char *name){
	cvar_t		*cvar;
	unsigned	hash;
	hash = Com_HashKey(name, CVARS_HASH_SIZE);
	for (cvar = cvar_varsHashTable[hash]; cvar; cvar = cvar->nextHash){
		if (!Q_stricmp(cvar->name, name))
			return cvar;
	}
	return NULL;
}

**anduril462** · 06-13-2011

I can't give you the exact reasoning that Com_HashKey is written the way it is. All I can tell you is designing a good hash function is an art best left to people with fancy degrees in math and comp sci. The basic idea is take some data, and run it through a function to produce an index into your hash table. The big thing you want to do is avoid collisions, a collision being two different inputs that produce the same hash key. Collisions make hash tables inefficient.

My guess is that Com_HashKey is a simple hash function they cobbled together because they weren't too worried about performance. The fact that they tolower the string means they want "ABCDE", "Abcde", "abcde", etc to not only have the same hash key, but to actually refer to the exact same element (this is backed up by the use of Q_stricmp in Cvar_FindVariable). The modulus at the end is to make sure your hash key is within bounds for the hash table. I'm guessing they picked 37 as a multiplier because it's a prime. Maybe somebody else can give a better analysis. Otherwise, I think it's best to just take it for what it is -- a quick and dirty hash function.

**iMalc** · 06-14-2011

I agree with the above analysis.
If you are in the position to change this code then I recommend switching to the FNV-1a hash instead. It is very similiar but simpler and better at the same time. You would have no trouble switching to it.

**127.0.0.1** · 06-14-2011

Thanks for the suggestions and analysis

, this is mostly a personal project that I am doing for educational purposes so I would like to create my own that improves on the given code.

Thread: Help understanding hash function

Thread Tools

Search Thread

Display

Help understanding hash function

Similar Threads

Hash function to hash key into geographic coordinate

Help understanding Hash Table

Hash Function

Hash function

about hash function