Hash Table with Chaining

**osxd00d** · 12-03-2007

I'm attempting to write a program for class that takes a dictionary file and reads it into a hash table. Then it will prompt the user for their file that they wish to spell check against the dictionary file. It will read that file word by word, and then spit out the words that were not found in the hash table as spelling errors, giving the word that was misspelled, an the line number it was on.

I currently have the file scanner setup so that it eliminates odd characters, and splits up each line into multiple strings. I've never written a hash table, so I'm a bit lost on how to even begin this project.

Here's basically what I understand.

A hash table is an array (obviously). To figure out where to place each word, I'm going to apply the formula given by the assignment specifications, "hashFunction(w) = (c0*73^(0 mod 4) +c1*73^(1 % 4)+&+cn*73n % 4 ) % [TableLength]"

So I understand in theory that if there's a collision, I'll add a node to a linked list on the front of my chain stored in the corresponding array spot. What I don't understand however, is how to implement the code.

Finally, to keep the has function running quickly, I'll start with an array size of 128. When I get 128 words in the array will need to double in size, and when I get to 256, it would double, and then 512, etc.

Right now I've got it implemented, but without the separate chaining. If anyone could help me get that part, I'd appreciate it.

**Perspective** · 12-03-2007

Make the array element a pointer to the head of the list, then you have a list anchored at each array position. You can google for linked lists to get an idea of how to implement them.

**osxd00d** · 12-03-2007

Originally Posted by Perspective

Make the array element a pointer to the head of the list, then you have a list anchored at each array position. You can google for linked lists to get an idea of how to implement them.

What I don't get though, is how do you dynamically create a linked list? My experience has been just using one at a time.

Can I just do this?

Code:

struct HashTable
{

    int* store; 
    char* taken; 
    int numItems;
    int arrayLength;
    
    struct node{
    int data;
	char dictword[1024];  
	struct node* next;
    };
};

Edit- NVM it doesn't compile.

**Codeplug** · 12-03-2007

>> My experience has been just using one at a time.
So you are familiar with with a regular link list implementation? Good.

Now think about having an array of those link lists (of size TableLength).

gg

**osxd00d** · 12-03-2007

Originally Posted by Codeplug

>> My experience has been just using one at a time.
So you are familiar with with a regular link list implementation? Good.

Now think about having an array of those link lists (of size TableLength).

gg

I apologize if I'm missing the obvious here...I know this is one of the more basic data structures in C.

My understanding of how to create a linked list is to declare a node....

Code:

struct node{
    // Parameters go here;
    struct node* next;  
};

Could it be anything like this, or am I on the wrong track?

Code:

struct HashTable
{

    int* store; 
    char* taken; 
    int numItems;
    int TableLength;
    struct node* head = NULL;
};

After thinking about it, I believe I'd have to make a "head" inside each position in the array, but hopefully that's specific enough someone can guide me.

**Codeplug** · 12-04-2007

>> struct node* head = NULL;
That's pseudo-code for a singly linked list. What you want is an array those things.

gg

**osxd00d** · 12-04-2007

Originally Posted by Codeplug

>> struct node* head = NULL;
That's pseudo-code for a singly linked list. What you want is an array those things.

gg

Right, I get the concept, at least I think I do

hashtable[0]=word-> word-> word....
hashtable[1]=word-> word-> word....
hashtable[2]=word-> word-> word....

The code however, I'm clueless. I've been sitting here for a few hours trying everything.

**Codeplug** · 12-04-2007

Think of an array in general. Let's say you have a single integer i:
int i;
Now you need an array of 50 integers:
int i[50];

So for an array of 50 linked lists you'd have:
struct node *heads[50];

Then when you need to access the linked list at element n:
struct node *head = heads[n];

gg

**osxd00d** · 12-04-2007

Originally Posted by Codeplug

Think of an array in general. Let's say you have a single integer i:
int i;
Now you need an array of 50 integers:
int i[50];

So for an array of 50 linked lists you'd have:
struct node *heads[50];

Then when you need to access the linked list at element n:
struct node *head = heads[n];

gg

Thanks. I think this should be enough code to get me going!

**osxd00d** · 12-04-2007

Ok, I apologize, I'm still lost.

To create a new hash table

Code:

struct HashTable* newHashTable()
{
    int i;
    struct HashTable* newguy = (struct HashTable*)malloc(sizeof(struct HashTable));
    newguy->store = (int*)calloc(128, sizeof(int));
    newguy->taken = (char*)calloc(128, sizeof(char));
    newguy->numItems = 0;
    newguy->arrayLength = 128;


    return newguy;
}

Now, here's where I insert the data into the array position.

Code:

void insert(struct HashTable* h, char dictword)
{
    int hashValue = hashFunction(h, dictword);
    int position;
    int offset;
    int i;

    /* Find a spot to put the new value by quadratic probing */
    position = hashValue;

    /* Keep looking for a spot until we find one,
    even if that spot has had lazy deletion done to it. */
    for(i=1; h->taken[position] == FULL; i++)
    {
        /* If we couldn't find a spot to place the
        item make the hash table bigger. */
        if(i > h->arrayLength)
        {
            expandTable(h);

            /* After expanding the table, we need to find the updated hash value
            for the item we're inserting, and start the probe over */
            hashValue = hashFunction(h,dictword);
            i=0;
        }
        /* Quadratic Probing */
        offset = i*i;

        // Mod by the array length to make sure that we're within the array. */
        position = (hashValue + offset) % h->arrayLength;
    }

    h->store[position] = dictword;
    h->taken[position] = FULL;
    h->numItems++;

    // If the table has become too full, expand it.
    if((double)h->numItems / (double)h->arrayLength > THRESHOLD)
        expandTable(h);
}

Can someone show me where I modify those two in order to get this working with separate chaining instead of quadratic probing?

**Codeplug** · 12-04-2007

Quadratic Probing and Separate Chaining are types of "collision resolution" algorithms. A "collision" is when two distinct keys "hash" to the same position in your index or array. A collision resolution algorithm describes what you do when you have a collision.

With a probing resolution algorithm, when you hash a key to an already used index (a collision) you then search, or "probe", for an unused index using some formula.

With a chaining resolution algorithm, you don't have to search for an available index if that index is already in use. Instead, think of the index as a "bucket". Any keys that hash to that bucket simply get thrown in the bucket. A common way of implementing this "bucket" is to use a linked list.

http://en.wikipedia.org/wiki/Hash_ta...ion_resolution

So here are some key differences between the two to help you get started.

- Probing uses a dynamically sized hash table, Chaining uses a fixed sized hash table.
So you won't need "extendTable()".

- Probing uses a table of keys, Chaining uses a table of "buckets".
The bucket in this case is a linked list. So your table will actually be a fixed array of linked lists.

gg

**osxd00d** · 12-04-2007

I'm sorry, I understand the theory, but the dynamic scaling is something that my professor requires. I honestly just don't get the code implementation, and I've been searching for the past week to find it. The assignment is due tonight (I almost never post online for help with HW). If there's any way someone could show me how to setup the insert function and the nodes, I think I could finish it. I promise you that there's not a link on the first 5 pages of google results dealing with hash tables and chaining I haven't read. I could draw you a map of where each number goes as it's inserted, but couldn't sit down and code it.

**Codeplug** · 12-04-2007

>> but the dynamic scaling is something that my professor requires
Open hashing techniques typically use a fixed table size. Why do you think dynamic scaling is required?

Here's a possible HashTable structure:

Code:

const int TableLength = 50;

struct HashTable
{
    int numItems;
    struct node* heads[TableLength];
};

What you have here is an array of pointers to node. Each index into 'heads' is a single linked list - which you know how to work with. So your insert and lookup functions will go something like this:

Code:

// To Insert key k: 
LinkedList l = HashTable.heads[Hash(k)];
LinkedListInsert(l, k);

// To Lookup key k: LinkedListFind(
LinkedList l = HashTable.heads[Hash(k)];
LinkedListFind(l, k);

For further assistance, post the code you've written so far where you're having troubles.

gg

Thread: Hash Table with Chaining

Thread Tools

Search Thread

Display

Hash Table with Chaining

Similar Threads

Dictionary in C# to hash table in C?

Writing array, to file

Group Project Help/Volunteer

Hash table creation and insertion

Not sure on hash table memory allocations