Thread: Hash Table with Chaining

  1. #1
    Registered User
    Join Date
    Dec 2007
    Posts
    16

    Hash Table with Chaining

    I'm attempting to write a program for class that takes a dictionary file and reads it into a hash table. Then it will prompt the user for their file that they wish to spell check against the dictionary file. It will read that file word by word, and then spit out the words that were not found in the hash table as spelling errors, giving the word that was misspelled, an the line number it was on.

    I currently have the file scanner setup so that it eliminates odd characters, and splits up each line into multiple strings. I've never written a hash table, so I'm a bit lost on how to even begin this project.

    Here's basically what I understand.

    A hash table is an array (obviously). To figure out where to place each word, I'm going to apply the formula given by the assignment specifications, "hashFunction(w) = (c0*73^(0 mod 4) +c1*73^(1 % 4)+&+cn*73n % 4 ) % [TableLength]"

    So I understand in theory that if there's a collision, I'll add a node to a linked list on the front of my chain stored in the corresponding array spot. What I don't understand however, is how to implement the code.

    Finally, to keep the has function running quickly, I'll start with an array size of 128. When I get 128 words in the array will need to double in size, and when I get to 256, it would double, and then 512, etc.

    Right now I've got it implemented, but without the separate chaining. If anyone could help me get that part, I'd appreciate it.
    Last edited by osxd00d; 12-03-2007 at 09:46 PM.

  2. #2
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640
    Make the array element a pointer to the head of the list, then you have a list anchored at each array position. You can google for linked lists to get an idea of how to implement them.

  3. #3
    Registered User
    Join Date
    Dec 2007
    Posts
    16
    Quote Originally Posted by Perspective View Post
    Make the array element a pointer to the head of the list, then you have a list anchored at each array position. You can google for linked lists to get an idea of how to implement them.
    What I don't get though, is how do you dynamically create a linked list? My experience has been just using one at a time.

    Can I just do this?

    Code:
    struct HashTable
    {
    
        int* store; 
        char* taken; 
        int numItems;
        int arrayLength;
        
        struct node{
        int data;
    	char dictword[1024];  
    	struct node* next;
        };
    };
    Edit- NVM it doesn't compile.
    Last edited by osxd00d; 12-03-2007 at 10:48 PM.

  4. #4
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> My experience has been just using one at a time.
    So you are familiar with with a regular link list implementation? Good.

    Now think about having an array of those link lists (of size TableLength).

    gg

  5. #5
    Registered User
    Join Date
    Dec 2007
    Posts
    16
    Quote Originally Posted by Codeplug View Post
    >> My experience has been just using one at a time.
    So you are familiar with with a regular link list implementation? Good.

    Now think about having an array of those link lists (of size TableLength).

    gg
    I apologize if I'm missing the obvious here...I know this is one of the more basic data structures in C.

    My understanding of how to create a linked list is to declare a node....
    Code:
    struct node{
        // Parameters go here;
        struct node* next;  
    };
    Could it be anything like this, or am I on the wrong track?
    Code:
    struct HashTable
    {
    
        int* store; 
        char* taken; 
        int numItems;
        int TableLength;
        struct node* head = NULL;
    };
    After thinking about it, I believe I'd have to make a "head" inside each position in the array, but hopefully that's specific enough someone can guide me.

  6. #6
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> struct node* head = NULL;
    That's pseudo-code for a singly linked list. What you want is an array those things.

    gg

  7. #7
    Registered User
    Join Date
    Dec 2007
    Posts
    16
    Quote Originally Posted by Codeplug View Post
    >> struct node* head = NULL;
    That's pseudo-code for a singly linked list. What you want is an array those things.

    gg
    Right, I get the concept, at least I think I do

    hashtable[0]=word-> word-> word....
    hashtable[1]=word-> word-> word....
    hashtable[2]=word-> word-> word....

    The code however, I'm clueless. I've been sitting here for a few hours trying everything.

  8. #8
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    Think of an array in general. Let's say you have a single integer i:
    int i;
    Now you need an array of 50 integers:
    int i[50];

    So for an array of 50 linked lists you'd have:
    struct node *heads[50];

    Then when you need to access the linked list at element n:
    struct node *head = heads[n];

    gg

  9. #9
    Registered User
    Join Date
    Dec 2007
    Posts
    16
    Quote Originally Posted by Codeplug View Post
    Think of an array in general. Let's say you have a single integer i:
    int i;
    Now you need an array of 50 integers:
    int i[50];

    So for an array of 50 linked lists you'd have:
    struct node *heads[50];

    Then when you need to access the linked list at element n:
    struct node *head = heads[n];

    gg
    Thanks. I think this should be enough code to get me going!

  10. #10
    Registered User
    Join Date
    Dec 2007
    Posts
    16
    Ok, I apologize, I'm still lost.

    To create a new hash table
    Code:
    struct HashTable* newHashTable()
    {
        int i;
        struct HashTable* newguy = (struct HashTable*)malloc(sizeof(struct HashTable));
        newguy->store = (int*)calloc(128, sizeof(int));
        newguy->taken = (char*)calloc(128, sizeof(char));
        newguy->numItems = 0;
        newguy->arrayLength = 128;
    
    
        return newguy;
    }
    Now, here's where I insert the data into the array position.
    Code:
    void insert(struct HashTable* h, char dictword)
    {
        int hashValue = hashFunction(h, dictword);
        int position;
        int offset;
        int i;
    
        /* Find a spot to put the new value by quadratic probing */
        position = hashValue;
    
        /* Keep looking for a spot until we find one,
        even if that spot has had lazy deletion done to it. */
        for(i=1; h->taken[position] == FULL; i++)
        {
            /* If we couldn't find a spot to place the
            item make the hash table bigger. */
            if(i > h->arrayLength)
            {
                expandTable(h);
    
                /* After expanding the table, we need to find the updated hash value
                for the item we're inserting, and start the probe over */
                hashValue = hashFunction(h,dictword);
                i=0;
            }
            /* Quadratic Probing */
            offset = i*i;
    
            // Mod by the array length to make sure that we're within the array. */
            position = (hashValue + offset) % h->arrayLength;
        }
    
        h->store[position] = dictword;
        h->taken[position] = FULL;
        h->numItems++;
    
        // If the table has become too full, expand it.
        if((double)h->numItems / (double)h->arrayLength > THRESHOLD)
            expandTable(h);
    }
    Can someone show me where I modify those two in order to get this working with separate chaining instead of quadratic probing?

  11. #11
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    Quadratic Probing and Separate Chaining are types of "collision resolution" algorithms. A "collision" is when two distinct keys "hash" to the same position in your index or array. A collision resolution algorithm describes what you do when you have a collision.

    With a probing resolution algorithm, when you hash a key to an already used index (a collision) you then search, or "probe", for an unused index using some formula.

    With a chaining resolution algorithm, you don't have to search for an available index if that index is already in use. Instead, think of the index as a "bucket". Any keys that hash to that bucket simply get thrown in the bucket. A common way of implementing this "bucket" is to use a linked list.

    http://en.wikipedia.org/wiki/Hash_ta...ion_resolution

    So here are some key differences between the two to help you get started.

    - Probing uses a dynamically sized hash table, Chaining uses a fixed sized hash table.
    So you won't need "extendTable()".

    - Probing uses a table of keys, Chaining uses a table of "buckets".
    The bucket in this case is a linked list. So your table will actually be a fixed array of linked lists.

    gg

  12. #12
    Registered User
    Join Date
    Dec 2007
    Posts
    16
    I'm sorry, I understand the theory, but the dynamic scaling is something that my professor requires. I honestly just don't get the code implementation, and I've been searching for the past week to find it. The assignment is due tonight (I almost never post online for help with HW). If there's any way someone could show me how to setup the insert function and the nodes, I think I could finish it. I promise you that there's not a link on the first 5 pages of google results dealing with hash tables and chaining I haven't read. I could draw you a map of where each number goes as it's inserted, but couldn't sit down and code it.

  13. #13
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> but the dynamic scaling is something that my professor requires
    Open hashing techniques typically use a fixed table size. Why do you think dynamic scaling is required?

    Here's a possible HashTable structure:
    Code:
    const int TableLength = 50;
    
    struct HashTable
    {
        int numItems;
        struct node* heads[TableLength];
    };
    What you have here is an array of pointers to node. Each index into 'heads' is a single linked list - which you know how to work with. So your insert and lookup functions will go something like this:
    Code:
    // To Insert key k: 
    LinkedList l = HashTable.heads[Hash(k)];
    LinkedListInsert(l, k);
    
    // To Lookup key k: LinkedListFind(
    LinkedList l = HashTable.heads[Hash(k)];
    LinkedListFind(l, k);
    For further assistance, post the code you've written so far where you're having troubles.

    gg

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Dictionary in C# to hash table in C?
    By dinoman in forum C Programming
    Replies: 2
    Last Post: 04-12-2009, 09:23 PM
  2. Writing array, to file
    By zootreeves in forum C Programming
    Replies: 9
    Last Post: 09-08-2007, 05:06 PM
  3. Group Project Help/Volunteer
    By DarkDot in forum C++ Programming
    Replies: 3
    Last Post: 04-24-2007, 11:36 PM
  4. Hash table creation and insertion
    By tgshah in forum C Programming
    Replies: 1
    Last Post: 01-23-2006, 07:54 PM
  5. Not sure on hash table memory allocations
    By Thumper333 in forum C Programming
    Replies: 3
    Last Post: 09-27-2004, 09:00 PM