Thread: Help understanding Hash Table

  1. #1
    Registered User
    Join Date
    Dec 2009
    Location
    UK
    Posts
    30

    Help understanding Hash Table

    Hi, i am having trouble understanding the implementation of a hash table. I've been on quite a few links off google and still can't get my head round it!
    My project needs to store all the words from a text file in a hash table (using linked lists to deal with collisions), when an identical word is inserted it increments the frequency value of the list element instead of making a new entry in the linked list.
    First off, here is what i have:

    structs to deal with table and lists.
    Code:
    typedef struct HL {
    	char string[20] ;
    	int frequency ; //to deal with identical words inserted
    	struct HL *next ;
    } Hash_list ;
    
    typedef struct HT {
        int size;
        Hash_list **table;
    } Hash_table;
    called somewhere in main with
    Code:
    Hash_table table = createTable() ;
    :
    Code:
    Hash_table *createTable( void ) {
    	Hash_table *hashTable = calloc(1 , sizeof(Hash_table)) ;
    	hashTable->size = 8 ;
    	hashTable->table = (Hash_list**)malloc(hashTable->size * sizeof(Hash_list)) ;
    	int i ;
    	for(i = 0 ; i < hashTable->size ; i++)
    		hashTable->table[i] = (Hash_list*)malloc(sizeof(Hash_list)) ;
    	return hashTable ;
    }
    Then code for inserting word (in a loop):
    Code:
    fscanf(f, "%s" , word) ;
    		format(word) ; //removes special characters and capitalises every character
    		table = addTableElement(table , word) ;
    Code:
    /* Add new element to hash table */
    Hash_table *addTableElement(Hash_table *old , char *elem){
    	int h = hash(elem) ; //get hash key of string (currently returns 0 for testing)
    	old->table[h] = addHashElement(old->table[h] , elem); //add it to correct list
    	return old ; //return edited hash table
    }
    Code:
    /* Add new element to Hash_list */
    Hash_list *addHashElement(Hash_list *old , char *elem){
    	if(old == NULL){
    		return makeHashList(elem , NULL) ; //if no entries in this list yet, create using makeHashList
    	}else{
    		Hash_list *new ;
    		for(new = old ; new->next != NULL ; new = new->next) {
    			if(strcmp(new->string , elem) == 0){//if it finds the string already in hash list
    				new->frequency++ ; //add one to the frequency value
    				while(new->next != NULL)//continue iterating through list
    					new = new->next ;
    				return old ;	//then return the list
    			}
    		}
    		new->next = makeHashList(elem , NULL) ;//if the string is not in the list, make a new entry in the list
    		return old ;//then return it
    	}
    }
    Code:
    /* Create element in Hash_list */
    Hash_list *makeHashList (char *string , Hash_list *next){
    	Hash_list *new = calloc(1 , sizeof(Hash_list)) ;
    	strncpy(new->string , string , 20) ;
    	new->next = next ;
    	new->frequency = 1 ;
    	return new ;
    }
    Ok, sorry about all that, i've tried to give you the bare minimum code! At the moment the program crashes at the line:
    old->table[h] = addHashElement(old->table[h] , elem);
    in function addTableElement(Hash_table *old , char *elem).

    I think it is because of the way i constructed the hash table in function createTable() , but i'm not sure!

    please, if anyone can help that would be great!
    -Ed

  2. #2
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Code:
    	for(i = 0 ; i < hashTable->size ; i++)
    		hashTable->table[i] = (Hash_list*)malloc(sizeof(Hash_list)) ;
    It seems to me that each table[i] is uninitialized (because you've used malloc() here, and not calloc() or memset() or anything).
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  3. #3
    Registered User
    Join Date
    Dec 2009
    Location
    UK
    Posts
    30
    Thank you, calloc made it all better! Now i have a slightly more annoying problem:
    I have started making a real hash function now, instead of just returning 0 it now adds up each character in the string like so:

    Code:
    /* Hash function */
    int hash(Hash_table *hashTable , char *elem){
    	int i ;
    	int x = 0 ;
    	char c = elem[0];
    	for(i = 0 ; c != '\0' ; i++){
    		x = x + c ;
    		c = elem[i+1] ;
    	}
    	return x ;
    }
    i initially had my hash table size set to 8, now i am getting keys way above that, to deal with that i am trying to resize the hash table like this:

    Code:
    Hash_table *addTableElement(Hash_table *old , char *elem){
    	int h = hash(old , elem) ; //get hash key of string
    	if(old->size < h){
    		//printf("%s" , elem ) ;
    		int oldSize = old->size - 1;
    		old->size = h ;
    		old->table = (Hash_list**)realloc(old->table , sizeof(Hash_list)*h) ;
    		int i ;
    		for(i = oldSize ; i < old->size ; i++){
    			old->table[i] = (Hash_list*)calloc(1 , sizeof(Hash_list)) ;
    		}
    	} 
    ...... deal with nicely resized table....
    return old ;
    }
    I know its wrong as its effecting code further in the program, but i'm not sure whats causing it! Am i at least on the right lines?

    thanks!
    Last edited by boblettoj99; 12-27-2009 at 01:10 PM.

  4. #4
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Quote Originally Posted by boblettoj99 View Post
    Thank you, calloc made it all better! Now i have a slightly more annoying problem:
    I have started making a real hash function now, instead of just returning 0 it now adds up each character in the string like so:

    Code:
    /* Hash function */
    int hash(Hash_table *hashTable , char *elem){
    	int i ;
    	int x = 0 ;
    	char c = elem[0];
    	for(i = 0 ; c != '\0' ; i++){
    		x = x + c ;
    		c = elem[i+1] ;
    	}
    	return x ;
    }
    That works, but it hurt my brain a little, unnecessarily. What's wrong with
    Code:
    int hash(Hash_table *hashTable , char *elem) {
        int sum = 0, i;
        for(i = 0; elem[i]; i ++) {
            sum += elem[i];
        }
        return sum;
    }
    or even this? (If you happen to like pointers.)
    Code:
    int hash(Hash_table *hashTable , char *elem) {
        int sum = 0;
        while(*elem) {
            sum += *elem;
            elem ++;
        }
        return sum;
    }
    i initially had my hash table size set to 8, now i am getting keys way above that, to deal with that i am trying to resize the hash table ...
    Usually in a hash table the key is modded by the size of the table; so you'd find the correct index into the table by using something like
    Code:
    hash(table, data) % table->size
    You really shouldn't be resizing the table just because you have a key that's larger than it.

    Then, since you're using a hash table with chaining, if multiple keys map to the same location in the table, you just look through the list of elements there. Read Wikipedia or something if you want to know more about this . . . .
    Last edited by dwks; 12-29-2009 at 01:49 PM. Reason: my slash key isn't working very well
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Dictionary in C# to hash table in C?
    By dinoman in forum C Programming
    Replies: 2
    Last Post: 04-12-2009, 09:23 PM
  2. Writing array, to file
    By zootreeves in forum C Programming
    Replies: 9
    Last Post: 09-08-2007, 05:06 PM
  3. Group Project Help/Volunteer
    By DarkDot in forum C++ Programming
    Replies: 3
    Last Post: 04-24-2007, 11:36 PM
  4. Hash table creation and insertion
    By tgshah in forum C Programming
    Replies: 1
    Last Post: 01-23-2006, 07:54 PM
  5. Not sure on hash table memory allocations
    By Thumper333 in forum C Programming
    Replies: 3
    Last Post: 09-27-2004, 09:00 PM